Re: [ceph-users] rbd map hangs on Ceph Cluster

2014-05-27 Thread Ilya Dryomov
On Tue, May 27, 2014 at 9:04 PM, Sharmila Govind wrote: > Hi, > > I am setting up a ceph cluster for some experimentation. The cluster is > setup successfully. But, When I try running rbd map on the host, the kernel > crashes(system hangs) and I need to do a hard reset for it to recover. Below >

Re: [ceph-users] question about feature set mismatch

2014-06-06 Thread Ilya Dryomov
On Thu, Jun 5, 2014 at 10:38 PM, Igor Krstic wrote: > Hello, > > dmesg: > [ 690.181780] libceph: mon1 192.168.214.102:6789 feature set mismatch, my > 4a042a42 < server's 504a042a42, missing 50 > [ 690.181907] libceph: mon1 192.168.214.102:6789 socket error on read > [ 700.190342] libcep

Re: [ceph-users] question about feature set mismatch

2014-06-06 Thread Ilya Dryomov
On Fri, Jun 6, 2014 at 4:34 PM, Kenneth Waegeman wrote: > > - Message from Igor Krstic - >Date: Fri, 06 Jun 2014 13:23:19 +0200 >From: Igor Krstic > Subject: Re: [ceph-users] question about feature set mismatch > To: Ilya Dryomov > Cc: ceph

Re: [ceph-users] feature set mismatch, "missing 20000000000"

2014-06-06 Thread Ilya Dryomov
On Fri, Jun 6, 2014 at 5:35 PM, Bryan Wright wrote: > Hi folks, > > Thanks to Sage Weil's advice, I fixed my "TMAP2OMAP" problem by just > restarting the osds, but now I'm running into the following cephfs problem. > When I try to mount the filesystem, I get errors like the following: > >

Re: [ceph-users] rbdmap issue

2014-06-06 Thread Ilya Dryomov
On Fri, Jun 6, 2014 at 4:47 PM, Ignazio Cassano wrote: > Hi all, > I configured a ceph cluster firefly on ubuntu 12.04. > I also configured a centos 6.5 client with ceph-0.80.1-2.el6.x86_64 > and kernel 3.14.2-1.el6.elrepo.x86_64 > On CentOS I am able to use rbd remote block devices but if I try t

Re: [ceph-users] rbdmap issue

2014-06-06 Thread Ilya Dryomov
On Fri, Jun 6, 2014 at 6:15 PM, Ignazio Cassano wrote: > Hi Ilya, no file 50-rbd.rules exists on my system. My guess would be that the upgrade went south. In order for the symlinks to be created, that file should exist on the client (i.e. the system you run 'rbd map' on). As a temporary fix, you c

Re: [ceph-users] question about feature set mismatch

2014-06-08 Thread Ilya Dryomov
On Sun, Jun 8, 2014 at 11:27 AM, Igor Krstic wrote: > On Fri, 2014-06-06 at 17:40 +0400, Ilya Dryomov wrote: >> On Fri, Jun 6, 2014 at 4:34 PM, Kenneth Waegeman >> wrote: >> > >> > - Message from Igor Krstic - >> >Date: Fri, 06 Jun 201

Re: [ceph-users] rbd: add failed: (34) Numerical result out of range

2014-06-09 Thread Ilya Dryomov
On Mon, Jun 9, 2014 at 11:48 AM, wrote: > I was building a small test cluster and noticed a difference with trying > to rbd map depending on whether the cluster was built using fedora or > CentOS. > > When I used CentOS osds, and tried to rbd map from arch linux or fedora, > I would get "rbd: add

Re: [ceph-users] rbd snap protect error

2014-06-09 Thread Ilya Dryomov
On Mon, Jun 9, 2014 at 3:01 PM, Ignazio Cassano wrote: > Hi all, > I installed ceph firefly and now I am playing with rbd snapshots. > I created a pool (libvirt-pool) with two images: > > libvirtimage1 (format 1) > image2 (format 2). > > When I try to protect the first image: > > rbd --pool libvirt-
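
Snapshot protection (and cloning on top of it) belongs to the layering feature, which only format 2 images support, so a format 1 image cannot be protected. A minimal sketch of the working path (image name and size are placeholders):

  rbd create libvirt-pool/newimage --size 1024 --image-format 2
  rbd snap create libvirt-pool/newimage@base
  rbd snap protect libvirt-pool/newimage@base   # only valid on format 2 images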

Re: [ceph-users] question about feature set mismatch

2014-06-21 Thread Ilya Dryomov
On Fri, Jun 20, 2014 at 2:02 AM, Erik Logtenberg wrote: > Hi Ilya, > > Do you happen to know when this fix will be released? > > Is upgrading to a newer kernel (client side) still a solution/workaround > too? If yes, which kernel version is required? This fix is purely server-side, so no extra re

Re: [ceph-users] Firefly OSDs : set_extsize: FSSETXATTR: (22) Invalid argument

2014-06-24 Thread Ilya Dryomov
On Tue, Jun 24, 2014 at 12:02 PM, Florent B wrote: > Hi all, > > On 2 Firefly cluster, I have a lot of errors like this on my OSDs : > > 2014-06-24 09:54:39.088469 7fb5b8628700 0 > xfsfilestorebackend(/var/lib/ceph/osd/ceph-4) set_extsize: FSSETXATTR: > (22) Invalid argument > > Both are using XF

Re: [ceph-users] Firefly OSDs : set_extsize: FSSETXATTR: (22) Invalid argument

2014-06-24 Thread Ilya Dryomov
On Tue, Jun 24, 2014 at 1:15 PM, Florent B wrote: > On 06/24/2014 11:13 AM, Ilya Dryomov wrote: >> On Tue, Jun 24, 2014 at 12:02 PM, Florent B wrote: >>> Hi all, >>> >>> On 2 Firefly cluster, I have a lot of errors like this on my OSDs : >>> &

Re: [ceph-users] error mapping device in firefly

2014-07-04 Thread Ilya Dryomov
On Fri, Jul 4, 2014 at 11:48 AM, Xabier Elkano wrote: > Hi, > > I am trying to map a rbd device in Ubuntu 14.04 (kernel 3.13.0-30-generic): > > # rbd -p mypool create test1 --size 500 > > # rbd -p mypool ls > test1 > > # rbd -p mypool map test1 > rbd: add failed: (5) Input/output error > > and in

Re: [ceph-users] feature set mismatch after upgrade from Emperor to Firefly

2014-07-20 Thread Ilya Dryomov
On Sun, Jul 20, 2014 at 10:29 PM, Irek Fasikhov wrote: > Привет, Андрей. > > ceph osd getcrushmap -o /tmp/crush > crushtool -i /tmp/crush --set-chooseleaf_vary_r 0 -o /tmp/crush.new > ceph osd setcrushmap -i /tmp/crush.new > > Or > > update kernel 3.15. > > > 2014-07-20 20:19 GMT+04:00 Andrei Mikh
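
Either route can be sanity-checked with the stock CLI; a minimal sketch (the profile name is only an example, pick one your client kernels understand):

  ceph osd crush show-tunables        # what the cluster currently requires of clients
  ceph osd crush tunables bobtail     # profile-based alternative to patching chooseleaf_vary_r by hand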

Re: [ceph-users] ceph firefly 0.80.4 unable to use rbd map and ceph fs mount

2014-07-21 Thread Ilya Dryomov
On Mon, Jul 21, 2014 at 1:58 PM, Wido den Hollander wrote: > On 07/21/2014 11:32 AM, 漆晓芳 wrote: >> >> Hi all: >> I'm doing tests with firefly 0.80.4. I want to test the performance >> with tools such as FIO and iozone. When I decided to test the rbd storage >> performance with fio, I ran commands on

Re: [ceph-users] GPF kernel panics

2014-07-31 Thread Ilya Dryomov
On Thu, Jul 31, 2014 at 11:44 AM, James Eckersall wrote: > Hi, > > I've had a fun time with ceph this week. > We have a cluster with 4 OSD (20 OSD's per) servers, 3 mons and a server > mapping ~200 rbd's and presenting cifs shares. > > We're using cephx and the export node has its own cephx auth k

Re: [ceph-users] GPF kernel panics

2014-07-31 Thread Ilya Dryomov
On Thu, Jul 31, 2014 at 12:37 PM, James Eckersall wrote: > Hi, > > The stacktraces are very similar. Here is another one with complete dmesg: > http://pastebin.com/g3X0pZ9E > > The rbd's are mapped by the rbdmap service on boot. > All our ceph servers are running Ubuntu 14.04 (kernel 3.13.0-30-ge

Re: [ceph-users] 0.80.5-1precise Not Able to Map RBD & CephFS

2014-07-31 Thread Ilya Dryomov
On Fri, Aug 1, 2014 at 12:08 AM, Larry Liu wrote: > I'm on ubuntu 12.04 kernel 3.2.0-53-generic. Just deployed ceph 0.80.5. > Creating pools & images works fine, but getting this error when trying to map an > rbd image or mount cephfs: > rbd: add failed: (5) Input/output error > > I did try flipping

Re: [ceph-users] 0.80.5-1precise Not Able to Map RBD & CephFS

2014-08-01 Thread Ilya Dryomov
On Fri, Aug 1, 2014 at 12:29 AM, German Anders wrote: > Hi Ilya, > I think you need to upgrade the kernel version of that ubuntu server, > I've a similar problem and after upgrade the kernel to 3.13 the problem was > resolved successfully. Ilya doesn't need to upgrade anything ;) Larry, if

Re: [ceph-users] 0.80.5-1precise Not Able to Map RBD & CephFS

2014-08-01 Thread Ilya Dryomov
On Fri, Aug 1, 2014 at 4:05 PM, Gregory Farnum wrote: > We appear to have solved this and then immediately re-broken it by > ensuring that the userspace daemons will set a new required feature > bit if there are any EC rules in the OSDMap. I was going to say > there's a ticket open for it, but I c

Re: [ceph-users] 0.80.5-1precise Not Able to Map RBD & CephFS

2014-08-01 Thread Ilya Dryomov
On Fri, Aug 1, 2014 at 8:34 PM, Larry Liu wrote: > root@u12ceph02:~# rbd map foo --pool rbd --name client.admin -m u12ceph01 -k > /etc/ceph/ceph.client.admin.keyring > rbd: add failed: (5) Input/output error > dmesg shows these right away after the IO error: > [ 461.010895] libceph: mon0 10.190.1

Re: [ceph-users] 0.80.5-1precise Not Able to Map RBD & CephFS

2014-08-01 Thread Ilya Dryomov
On Fri, Aug 1, 2014 at 4:22 PM, Ilya Dryomov wrote: > On Fri, Aug 1, 2014 at 4:05 PM, Gregory Farnum wrote: >> We appear to have solved this and then immediately re-broken it by >> ensuring that the userspace daemons will set a new required feature >> bit if there are any E

Re: [ceph-users] 0.80.5-1precise Not Able to Map RBD & CephFS

2014-08-01 Thread Ilya Dryomov
On Fri, Aug 1, 2014 at 10:06 PM, Ilya Dryomov wrote: > On Fri, Aug 1, 2014 at 4:22 PM, Ilya Dryomov wrote: >> On Fri, Aug 1, 2014 at 4:05 PM, Gregory Farnum wrote: >>> We appear to have solved this and then immediately re-broken it by >>> ensuring that the users

Re: [ceph-users] 0.80.5-1precise Not Able to Map RBD & CephFS

2014-08-01 Thread Ilya Dryomov
On Fri, Aug 1, 2014 at 10:32 PM, Larry Liu wrote: > crushmap file is attached. I'm running kernel 3.13.0-29-generic after another > person's suggestion. But the kernel upgrade didn't fix anything for me. Thanks! So there are two problems. First, you either have erasure pools or had them in the past

Re: [ceph-users] 0.80.5-1precise Not Able to Map RBD & CephFS

2014-08-02 Thread Ilya Dryomov
On Sat, Aug 2, 2014 at 1:41 AM, Christopher O'Connell wrote: > So I've been having a seemingly similar problem and while trying to follow > the steps in this thread, things have gone very south for me. Show me where in this thread I said to set tunables to optimal ;) optimal (== firefly for

Re: [ceph-users] 0.80.5-1precise Not Able to Map RBD & CephFS

2014-08-02 Thread Ilya Dryomov
On Sat, Aug 2, 2014 at 10:03 PM, Christopher O'Connell wrote: > On Sat, Aug 2, 2014 at 6:27 AM, Ilya Dryomov > wrote: >> >> On Sat, Aug 2, 2014 at 1:41 AM, Christopher O'Connell >> wrote: >> > So I've been having a seemingly similar problem and wh

Re: [ceph-users] 0.80.5-1precise Not Able to Map RBD & CephFS

2014-08-04 Thread Ilya Dryomov
On Sun, Aug 3, 2014 at 2:04 AM, Christopher O'Connell wrote: > To be more clear on my question, we currently use ELRepo for those rare > occasions when we need a 3.x kernel on centos. Are you aware of anyone > maintaining a 3.14 kernel? The fix is not in stable yet, and won't be for the next two-

Re: [ceph-users] Misdirected client messages

2014-09-03 Thread Ilya Dryomov
On Thu, Sep 4, 2014 at 12:18 AM, Maros Vegh wrote: > Thanks for your reply. > > We are experiencing these errors on two clusters. > The clusters are running firefly 0.80.5 on debian wheezy. > The clients are running firefly 0.80.4 on debian wheezy. > > On all monitors the parameter: > mon osd all

Re: [ceph-users] Misdirected client messages

2014-09-04 Thread Ilya Dryomov
On Thu, Sep 4, 2014 at 12:45 AM, Maros Vegh wrote: > The ceph fs is mounted via the kernel client. > The clients are running on this kernel: > 3.2.0-4-amd64 #1 SMP Debian 3.2.60-1+deb7u1 x86_64 GNU/Linux 3.2 is a pretty old kernel. Can you try it on a newer kernel, say 3.13 or 3.14? Is there an

Re: [ceph-users] NAS on RBD

2014-09-09 Thread Ilya Dryomov
On Tue, Sep 9, 2014 at 12:33 PM, Christian Balzer wrote: > > Hello, > > On Tue, 9 Sep 2014 17:05:03 +1000 Blair Bethwaite wrote: > >> Hi folks, >> >> In lieu of a prod ready Cephfs I'm wondering what others in the user >> community are doing for file-serving out of Ceph clusters (if at all)? >> >>

Re: [ceph-users] Best practices on Filesystem recovery on RBD block volume?

2014-09-10 Thread Ilya Dryomov
On Wed, Sep 10, 2014 at 2:45 PM, Keith Phua wrote: > Hi Andrei, > > Thanks for the suggestion. > > But rbd volume snapshots may only work if the filesystem is in a > consistent state, which means no IO during snapshotting. With cronjob > snapshotting, usually we have no control over client doing

Re: [ceph-users] writing to rbd mapped device produces hang tasks

2014-09-14 Thread Ilya Dryomov
On Sun, Sep 14, 2014 at 3:07 PM, Andrei Mikhailovsky wrote: > To answer my own question, I think I am getting the 8818 bug - > http://tracker.ceph.com/issues/8818. The solution seems to be to upgrade to > the latest 3.17 kernel branch. Or 3.16.3 when it comes out. Thanks, Ilya

Re: [ceph-users] Frequent Crashes on rbd to nfs gateway Server

2014-09-23 Thread Ilya Dryomov
On Fri, Sep 19, 2014 at 11:22 AM, Micha Krause wrote: > Hi, > >> I have build an NFS Server based on Sebastiens Blog Post here: >> >> http://www.sebastien-han.fr/blog/2012/07/06/nfs-over-rbd/ >> >> Im using Kernel 3.14-0.bpo.1-amd64 on Debian wheezy, the host is a VM on >> Vmware. >> >> Using rsyn

Re: [ceph-users] rbd: I/O Errors in low memory situations

2015-02-19 Thread Ilya Dryomov
On Fri, Feb 20, 2015 at 2:21 AM, Mike Christie wrote: > On 02/18/2015 06:05 PM, "Sebastian Köhler [Alfahosting GmbH]" wrote: >> Hi, >> >> yesterday we had had the problem that one of our cluster clients >> remounted a rbd device in read-only mode. We found this[1] stack trace >> in the logs. We in

Re: [ceph-users] Strange krbd behaviour with queue depths

2015-03-06 Thread Ilya Dryomov
On Thu, Mar 5, 2015 at 8:17 PM, Nick Fisk wrote: > I’m seeing a strange queue depth behaviour with a kernel mapped RBD, librbd > does not show this problem. > > > > Cluster is comprised of 4 nodes, 10GB networking, not including OSDs as test > sample is small so fits in page cache. What do you me

Re: [ceph-users] Strange krbd behaviour with queue depths

2015-03-06 Thread Ilya Dryomov
On Fri, Mar 6, 2015 at 7:27 PM, Nick Fisk wrote: > Hi Somnath, > > I think you hit the nail on the head, setting librbd to not use TCP_NODELAY > shows the same behaviour as with krbd. That's why I asked about the kernel version. TCP_NODELAY is enabled by default since 4.0-rc1, so if you are up

Re: [ceph-users] Strange krbd behaviour with queue depths

2015-03-06 Thread Ilya Dryomov
On Fri, Mar 6, 2015 at 9:52 PM, Alexandre DERUMIER wrote: > Hi, > > does somebody known if redhat will backport new krbd features (discard, > blk-mq, tcp_nodelay,...) to the redhat 3.10 kernel ? Yes, all of those will be backported. discard is already there in rhel7.1 kernel. Thanks,

Re: [ceph-users] CephFS: stripe_unit=65536 + object_size=1310720 => pipe.fault, server, going to standby

2015-03-11 Thread Ilya Dryomov
On Wed, Mar 11, 2015 at 1:21 PM, LOPEZ Jean-Charles wrote: > Hi Florent > > What are the « rules » for stripe_unit & object_size ? -> stripe_unit * > stripe_count = object_size > > So in your case set stripe_unit = 2 > > JC > > > On 11 Mar 2015, at 19:59, Florent B wrote: > > Hi all, > > I'm test
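
Worked through with the numbers from the subject line, stripe_unit * stripe_count = object_size means 65536 * 20 = 1310720, so stripe_count=20 (or a smaller object_size) makes the layout consistent. A sketch of setting such a layout on a CephFS directory, using the same vxattr format quoted later in this list (directory and pool names are placeholders):

  setfattr -n ceph.dir.layout \
    -v "stripe_unit=65536 stripe_count=20 object_size=1310720 pool=data" testdir
  getfattr -n ceph.dir.layout testdir   # confirm what the MDS accepted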

Re: [ceph-users] problem with rbd map

2015-03-12 Thread Ilya Dryomov
On Thu, Mar 12, 2015 at 3:33 PM, Marc Boisis wrote: > I’m trying to create my first ceph disk from a client named bjorn : > > [ceph@bjorn ~]$ rbd create foo --size 512000 -m helga -k > /etc/ceph/ceph.client.admin.keyring > [ceph@bjorn ~]$ sudo rbd map foo --pool pool_ulr_1 --name client.admin -m

Re: [ceph-users] problem with rbd map

2015-03-12 Thread Ilya Dryomov
On Thu, Mar 12, 2015 at 3:33 PM, Marc Boisis wrote: > I’m trying to create my first ceph disk from a client named bjorn : > > [ceph@bjorn ~]$ rbd create foo --size 512000 -m helga -k > /etc/ceph/ceph.client.admin.keyring > [ceph@bjorn ~]$ sudo rbd map foo --pool pool_ulr_1 --name client.admin -m

Re: [ceph-users] long blocking with writes on rbds

2015-04-09 Thread Ilya Dryomov
On Wed, Apr 8, 2015 at 7:36 PM, Lionel Bouton wrote: > On 04/08/15 18:24, Jeff Epstein wrote: >> Hi, I'm having sporadic very poor performance running ceph. Right now >> mkfs, even with nodiscard, takes 30 minutes or more. These kinds of >> delays happen often but irregularly. There seems to be no c

Re: [ceph-users] RBD hard crash on kernel 3.10

2015-04-09 Thread Ilya Dryomov
On Wed, Apr 8, 2015 at 5:25 PM, Shawn Edwards wrote: > We've been working on a storage repository for xenserver 6.5, which uses the > 3.10 kernel (ug). I got the xenserver guys to include the rbd and libceph > kernel modules into the 6.5 release, so that's at least available. > > Where things go

Re: [ceph-users] RBD hard crash on kernel 3.10

2015-04-13 Thread Ilya Dryomov
On Mon, Apr 13, 2015 at 10:18 PM, Shawn Edwards wrote: > Here's a vmcore, along with log files from Xen's crash dump utility. > > https://drive.google.com/file/d/0Bz8b7ZiWX00AeHRhMjNvdVNLdDQ/view?usp=sharing > > Let me know if we can help more. > > On Fri, Apr 1

Re: [ceph-users] rbd performance problem on kernel 3.13.6 and 3.18.11

2015-04-14 Thread Ilya Dryomov
On Tue, Apr 14, 2015 at 6:24 AM, yangruifeng.09...@h3c.com wrote: > Hi all! > > > > I am testing rbd performance based on the kernel rbd driver; when I compared the > result of kernel 3.13.6 with 3.18.11, my head gets so confused. > > > > look at the result, down by a third. > > > > 3.13.6 IOPS >

Re: [ceph-users] hammer (0.94.1) - still getting feature set mismatch for cephfs mount requests

2015-04-20 Thread Ilya Dryomov
On Mon, Apr 20, 2015 at 1:33 PM, Nikola Ciprich wrote: > Hello, > > I'm quite new to ceph, so please forgive my ignorance. > Yesterday, I've deployed small test cluster (3 nodes, 2 SATA + 1 SSD OSD / > node) > > I enabled MDS server and created cephfs data + metadata pools and created > filesyste

Re: [ceph-users] hammer (0.94.1) - still getting feature set mismatch for cephfs mount requests

2015-04-20 Thread Ilya Dryomov
On Mon, Apr 20, 2015 at 2:10 PM, Nikola Ciprich wrote: > Hello Ilya, >> Have you set your crush tunables to hammer? > > I've set crush tunables to optimal (therefore I guess they got set > to hammer). > > >> >> Your crushmap has straw2 buckets (alg straw2). That's going to be >> supported in 4.1

Re: [ceph-users] XFS extsize

2015-04-21 Thread Ilya Dryomov
On Tue, Apr 21, 2015 at 11:49 AM, Stefan Priebe - Profihost AG wrote: > Hi, > > while running firefly I've seen that each osd log prints: > 2015-04-21 10:24:49.498048 7fa9e925d780 0 > xfsfilestorebackend(/ceph/osd.49/) detect_feature: extsize is disabled > by conf > > The firefly release notes V
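
For reference, the conf knob that log line points at is the filestore XFS extsize flag; a minimal ceph.conf sketch, assuming you actually want the hint re-enabled (the default was turned off around firefly because of XFS issues, so check the release notes for your version first):

  [osd]
  filestore xfs extsize = true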

Re: [ceph-users] XFS extsize

2015-04-21 Thread Ilya Dryomov
On Tue, Apr 21, 2015 at 3:43 PM, Ilya Dryomov wrote: > On Tue, Apr 21, 2015 at 11:49 AM, Stefan Priebe - Profihost AG > wrote: >> Hi, >> >> while running firefly i've seen that each osd log prints: >> 2015-04-21 10:24:49.498048 7fa9e925d780 0

Re: [ceph-users] 3.18.11 - RBD triggered deadlock?

2015-04-24 Thread Ilya Dryomov
On Fri, Apr 24, 2015 at 6:41 PM, Nikola Ciprich wrote: > Hello once again, > > I seem to have hit one more problem today: > 3 nodes test cluster, nodes running 3.18.1 kernel, > ceph-0.94.1, 3-replicas pool, backed by SSD osds. Does this mean rbd device is mapped on a node that also runs one or mo

Re: [ceph-users] 3.18.11 - RBD triggered deadlock?

2015-04-24 Thread Ilya Dryomov
On Fri, Apr 24, 2015 at 7:06 PM, Nikola Ciprich wrote: >> >> Does this mean rbd device is mapped on a node that also runs one or >> more osds? > yes.. I know it's not the best practice, but it's just test cluster.. >> >> Can you watch osd sockets in netstat for a while and describe what you >> are

Re: [ceph-users] 3.18.11 - RBD triggered deadlock?

2015-04-25 Thread Ilya Dryomov
On Sat, Apr 25, 2015 at 9:56 AM, Nikola Ciprich wrote: >> >> It seems you just grepped for ceph-osd - that doesn't include sockets >> opened by the kernel client, which is what I was after. Paste the >> entire netstat? > ouch, bummer! here are full netstats, sorry about delay.. > > http://nik.lb

Re: [ceph-users] Kernel version for CephFS client ?

2015-05-04 Thread Ilya Dryomov
On Mon, May 4, 2015 at 11:46 AM, Florent B wrote: > Hi, > > I would like to know which kernel version is needed to mount CephFS on a > Hammer cluster ? > > And if we use the 3.16 kernel of Debian Jessie, can we hope to use CephFS for > the next few releases without problems ? I would advise running the lat

Re: [ceph-users] Kernel version for CephFS client ?

2015-05-04 Thread Ilya Dryomov
On Mon, May 4, 2015 at 9:40 PM, Chad William Seys wrote: > Hi Florent, > Most likely Debian will release "backported" kernels for Jessie, as they > have for Wheezy. > E.g. Wheezy has had kernel 3.16 backported to it: > > https://packages.debian.org/search?suite=wheezy-backports&searchon=names&

Re: [ceph-users] Kernel version for CephFS client ?

2015-05-04 Thread Ilya Dryomov
On Mon, May 4, 2015 at 11:25 PM, cwseys wrote: > Hi Ilya, > >> Any new features, development work and most of the enhancements are not >> backported. Only a selected bunch of bug fixes is. > > > Not sure what you are trying to say. > > Wheezy was released with kernel 3.2 and bugfixes are applied

Re: [ceph-users] rbd unmap command hangs when there is no network connection with mons and osds

2015-05-07 Thread Ilya Dryomov
On Thu, May 7, 2015 at 10:20 PM, Vandeir Eduardo wrote: > Hi, > > when issuing the rbd unmap command when there is no network connection with mons > and osds, the command hangs. Isn't there an option to force unmap even in > this situation? No, but you can Ctrl-C the unmap command and that should do i

Re: [ceph-users] rbd unmap command hangs when there is no network connection with mons and osds

2015-05-08 Thread Ilya Dryomov
auses the > resource agent stop command to timeout and the node is fenced. > > On Thu, May 7, 2015 at 4:37 PM, Ilya Dryomov wrote: >> >> On Thu, May 7, 2015 at 10:20 PM, Vandeir Eduardo >> wrote: >> > Hi, >> > >> > when issuing rbd unmap command when t

Re: [ceph-users] rbd unmap command hangs when there is no network connection with mons and osds

2015-05-08 Thread Ilya Dryomov
On Fri, May 8, 2015 at 3:59 PM, Ilya Dryomov wrote: > On Fri, May 8, 2015 at 1:18 PM, Vandeir Eduardo > wrote: >> This causes an annoying problem with rbd resource agent in pacemaker. In a >> situation where pacemaker needs to stop a rbd resource agent on a node where >

Re: [ceph-users] rbd unmap command hangs when there is no network connection with mons and osds

2015-05-08 Thread Ilya Dryomov
On Fri, May 8, 2015 at 4:13 PM, Vandeir Eduardo wrote: > Wouldn't a configuration named (map|unmap)_timeout be better? Because we are > talking about the map/unmap of an RBD device, not the mount/unmount of a file > system. The mount_timeout option is already there and is used in cephfs. We could certa

Re: [ceph-users] rbd unmap command hangs when there is no network connection with mons and osds

2015-05-08 Thread Ilya Dryomov
On Fri, May 8, 2015 at 4:25 PM, Ilya Dryomov wrote: > On Fri, May 8, 2015 at 4:13 PM, Vandeir Eduardo > wrote: >> Wouldn't be better a configuration named (map|unmap)_timeout? Cause we are >> talking about a map/unmap of a RBD device, not a mount/unmount of a

Re: [ceph-users] kernel version for rbd client and hammer tunables

2015-05-12 Thread Ilya Dryomov
On Tue, May 12, 2015 at 8:37 PM, Chad William Seys wrote: > Hi Ilya and all, > Is it safe to use kernel 3.16.7 rbd with Hammer tunables? I've tried > this on a test Hammer cluster and the client seems to work fine. > I've also mounted cephfs on a Hammer cluster (and Hammer tunable

Re: [ceph-users] kernel version for rbd client and hammer tunables

2015-05-12 Thread Ilya Dryomov
On Tue, May 12, 2015 at 10:38 PM, Chad William Seys wrote: > Hi Ilya and all, > Thanks for explaining. > I'm confused about what "building" a crushmap means. > After running > #ceph osd crush tunables hammer > data migrated around the cluster, so something changed. >

Re: [ceph-users] kernel version for rbd client and hammer tunables

2015-05-12 Thread Ilya Dryomov
On Tue, May 12, 2015 at 11:16 PM, Robert LeBlanc wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA256 > > What is the difference between straw and straw2 buckets? Should we consider > "upgrading" to straw2 buckets by dumping the CRUSH map and updating them? Well, straw bucket was supposed t

Re: [ceph-users] kernel version for rbd client and hammer tunables

2015-05-12 Thread Ilya Dryomov
On Tue, May 12, 2015 at 11:36 PM, Chad William Seys wrote: >> No, pools use crush rulesets. "straw" and "straw2" are bucket types >> (or algorithms). >> >> As an example, if you do "ceph osd crush add-bucket foo rack" on >> a cluster with firefly tunables, you will get a new straw bucket. The >>
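
For the "dump and update" route asked about above, a minimal sketch of the decompile-edit-recompile cycle (file names are placeholders; straw2 needs clients new enough to understand it, e.g. kernel 4.1+ as noted elsewhere in this list):

  ceph osd getcrushmap -o crush.bin
  crushtool -d crush.bin -o crush.txt           # decompile to text
  sed -i 's/alg straw$/alg straw2/' crush.txt   # convert straw buckets to straw2
  crushtool -c crush.txt -o crush.new           # recompile
  ceph osd setcrushmap -i crush.new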

Re: [ceph-users] Kernel Bug in 3.13.0-52

2015-05-13 Thread Ilya Dryomov
On Wed, May 13, 2015 at 11:20 PM, Daniel Takatori Ohara wrote: > Hello Lincoln, > > Thanks for the answer. I will upgrade the kernel on the clients. > > But, in version 0.94.1 (hammer), is the kernel the same? Is it 3.16? Pay attention to the "or later" part of "v3.16.3 or later". If you ca

Re: [ceph-users] what's the difference between pg and pgp?

2015-05-21 Thread Ilya Dryomov
On Thu, May 21, 2015 at 12:12 PM, baijia...@126.com wrote: > Re: what's the difference between pg and pgp? pg-num is the number of PGs, pgp-num is the number of PGs that will be considered for placement, i.e. it's the pgp-num value that is used by CRUSH, not pg-num. For example, consider pg-num
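
In practice the two are raised together, pgp-num after pg-num; a minimal sketch with a placeholder pool name and counts:

  ceph osd pool set rbd pg_num 256    # creates the new PGs (they split in place)
  ceph osd pool set rbd pgp_num 256   # lets CRUSH place them, which triggers the data movement
  ceph osd pool get rbd pg_num
  ceph osd pool get rbd pgp_num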

Re: [ceph-users] krbd and blk-mq max queue depth=128?

2015-06-04 Thread Ilya Dryomov
On Wed, Jun 3, 2015 at 8:03 PM, Nick Fisk wrote: > > Hi All, > > > > Am I correct in thinking that in latest kernels, now that krbd is supported > via blk-mq, the maximum queue depth is now 128 and cannot be adjusted > > > > http://xo4t.mj.am/link/xo4t/jw0u7zr/1/VnVTVD2KMuL7gZiTD1iRXQ/aHR0cHM6Ly9
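
For the record, later kernels did grow a per-device knob for this; a sketch, assuming a kernel recent enough to accept the map option (image name and value are placeholders):

  rbd map rbd/test1 -o queue_depth=256
  cat /sys/block/rbd0/queue/nr_requests   # check what the device ended up with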

Re: [ceph-users] rbd format v2 support

2015-06-07 Thread Ilya Dryomov
On Fri, Jun 5, 2015 at 6:47 AM, David Z wrote: > Hi Ceph folks, > > We want to use rbd format v2, but find it is not supported on kernel 3.10.0 > of centos 7: > > [ceph@ ~]$ sudo rbd map zhi_rbd_test_1 > rbd: sysfs write failed > rbd: map failed: (22) Invalid argument > [ceph@ ~]$ dmesg |

Re: [ceph-users] rbd format v2 support

2015-06-09 Thread Ilya Dryomov
On Tue, Jun 9, 2015 at 5:52 AM, David Z wrote: > Hi Ilya, > > Thanks for the reply. I knew that v2 image can be mapped if using default > striping parameters without --stripe-unit or --stripe-count. > > It is just the rbd performance (IOPS & bandwidth) we tested hasn't met our > goal. We found at

Re: [ceph-users] krbd splitting large IO's into smaller IO's

2015-06-10 Thread Ilya Dryomov
On Wed, Jun 10, 2015 at 3:23 PM, Dan van der Ster wrote: > Hi, > > I found something similar awhile ago within a VM. > http://lists.opennebula.org/pipermail/ceph-users-ceph.com/2014-November/045034.html > I don't know if the change suggested by Ilya ever got applied. Yeah, it got applied. We did

Re: [ceph-users] krbd splitting large IO's into smaller IO's

2015-06-10 Thread Ilya Dryomov
On Wed, Jun 10, 2015 at 2:47 PM, Nick Fisk wrote: > Hi, > > Using Kernel RBD client with Kernel 4.03 (I have also tried some older > kernels with the same effect) and IO is being split into smaller IO's which > is having a negative impact on performance. > > cat /sys/block/sdc/queue/max_hw_sectors

Re: [ceph-users] krbd splitting large IO's into smaller IO's

2015-06-10 Thread Ilya Dryomov
On Wed, Jun 10, 2015 at 6:18 PM, Nick Fisk wrote: >> -Original Message- >> From: Ilya Dryomov [mailto:idryo...@gmail.com] >> Sent: 10 June 2015 14:06 >> To: Nick Fisk >> Cc: ceph-users >> Subject: Re: [ceph-users] krbd splitting large IO's into sma

Re: [ceph-users] krbd splitting large IO's into smaller IO's

2015-06-10 Thread Ilya Dryomov
On Wed, Jun 10, 2015 at 7:04 PM, Nick Fisk wrote: >> > >> -Original Message- >> > >> From: Ilya Dryomov [mailto:idryo...@gmail.com] >> > >> Sent: 10 June 2015 14:06 >> > >> To: Nick Fisk >> > >> Cc: ceph-use

Re: [ceph-users] krbd splitting large IO's into smaller IO's

2015-06-11 Thread Ilya Dryomov
On Wed, Jun 10, 2015 at 7:07 PM, Ilya Dryomov wrote: > On Wed, Jun 10, 2015 at 7:04 PM, Nick Fisk wrote: >>> > >> -Original Message- >>> > >> From: Ilya Dryomov [mailto:idryo...@gmail.com] >>> > >> Sent: 10 June 2015 14:06 >&g

Re: [ceph-users] krbd splitting large IO's into smaller IO's

2015-06-11 Thread Ilya Dryomov
On Thu, Jun 11, 2015 at 2:23 PM, Ilya Dryomov wrote: > On Wed, Jun 10, 2015 at 7:07 PM, Ilya Dryomov wrote: >> On Wed, Jun 10, 2015 at 7:04 PM, Nick Fisk wrote: >>>> > >> -Original Message- >>>> > >> From: Ilya Dryomov [mailto:idr

Re: [ceph-users] krbd splitting large IO's into smaller IO's

2015-06-11 Thread Ilya Dryomov
On Thu, Jun 11, 2015 at 5:30 PM, Nick Fisk wrote: >> -Original Message- >> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of >> Ilya Dryomov >> Sent: 11 June 2015 12:33 >> To: Nick Fisk >> Cc: ceph-users >> Subject: Re: [

Re: [ceph-users] kernel 3.18 io bottlenecks?

2015-06-24 Thread Ilya Dryomov
On Wed, Jun 24, 2015 at 5:55 PM, Nick Fisk wrote: > That kernel probably has the bug where tcp_nodelay is not enabled. That is > fixed in Kernel 4.0+, however also in 4.0 blk-mq was introduced which brings > two other limitations:- > > > > 1. Max queue depth of 128 There will be an option i

Re: [ceph-users] kernel 3.18 io bottlenecks?

2015-06-24 Thread Ilya Dryomov
On Wed, Jun 24, 2015 at 8:38 PM, Stefan Priebe wrote: > > Am 24.06.2015 um 16:55 schrieb Nick Fisk: >> >> That kernel probably has the bug where tcp_nodelay is not enabled. That >> is fixed in Kernel 4.0+, however also in 4.0 blk-mq was introduced which >> brings two other limitations:- > > > blk-

Re: [ceph-users] kernel 3.18 io bottlenecks?

2015-06-25 Thread Ilya Dryomov
On Wed, Jun 24, 2015 at 10:29 PM, Stefan Priebe wrote: > > Am 24.06.2015 um 19:53 schrieb Ilya Dryomov: >> >> On Wed, Jun 24, 2015 at 8:38 PM, Stefan Priebe >> wrote: >>> >>> >>> Am 24.06.2015 um 16:55 schrieb Nick Fisk: >>>> >

Re: [ceph-users] 'rbd map' inside a docker container

2015-06-25 Thread Ilya Dryomov
On Thu, Jun 25, 2015 at 5:11 PM, Jan Safranek wrote: > I need to map an existing RBD into a docker container. I'm running > Fedora 21 both as the host and inside the container (i.e. the kernel > matches ceph user space tools) and I get this error: > > $ rbd map foo > rbd: add failed: (22) Invalid
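
Whatever the root cause turns out to be here, a common way to sidestep mapping inside a container at all is to map on the host and hand the resulting block device to the container; a sketch with placeholder device and image names (not necessarily what was suggested in this thread):

  rbd map foo                                       # on the host
  docker run -it --device /dev/rbd0 fedora:21 bash  # container sees the mapped device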

Re: [ceph-users] krbd splitting large IO's into smaller IO's

2015-06-26 Thread Ilya Dryomov
On Fri, Jun 26, 2015 at 3:17 PM, Z Zhang wrote: > Hi Ilya, > > I am seeing your recent email talking about krbd splitting large IO's into > smaller IO's, see below link. > > https://www.mail-archive.com/ceph-users@lists.ceph.com/msg20587.html > > I just tried it on my ceph cluster using kernel 3.1

Re: [ceph-users] kernel 3.18 io bottlenecks?

2015-06-27 Thread Ilya Dryomov
On Sat, Jun 27, 2015 at 6:20 PM, Stefan Priebe wrote: > Dear Ilya, > > Am 25.06.2015 um 14:07 schrieb Ilya Dryomov: >> >> On Wed, Jun 24, 2015 at 10:29 PM, Stefan Priebe >> wrote: >>> >>> >>> Am 24.06.2015 um 19:53 schrieb Ilya Dryomov:

Re: [ceph-users] krbd splitting large IO's into smaller IO's

2015-06-30 Thread Ilya Dryomov
On Tue, Jun 30, 2015 at 8:30 AM, Z Zhang wrote: > Hi Ilya, > > Thanks for your explanation. This makes sense. Will you make max_segments > configurable? Could you please point me to the fix you have made? We might help > test it. [PATCH] rbd: bump queue_max_segments on ceph-devel. Thanks,
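
Until such a change lands, the limits a mapped device actually ended up with can be read straight from sysfs (rbd0 is a placeholder for whatever device 'rbd map' created):

  cat /sys/block/rbd0/queue/max_sectors_kb      # largest IO the block layer will pass down
  cat /sys/block/rbd0/queue/max_hw_sectors_kb   # driver-advertised ceiling
  cat /sys/block/rbd0/queue/max_segments        # the limit discussed in this thread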

Re: [ceph-users] CephFS posix test performance

2015-06-30 Thread Ilya Dryomov
On Tue, Jun 30, 2015 at 6:57 AM, Yan, Zheng wrote: > I tried 4.1 kernel and 0.94.2 ceph-fuse. their performance are about the same. > > fuse: > Files=191, Tests=1964, 60 wallclock secs ( 0.43 usr 0.08 sys + 1.16 cusr > 0.65 csys = 2.32 CPU) > > kernel: > Files=191, Tests=2286, 61 wallclock se

Re: [ceph-users] RBD mounted image on linux server kernel error and hangs the device

2015-07-05 Thread Ilya Dryomov
On Sun, Jul 5, 2015 at 6:18 PM, Tuomas Juntunen wrote: > Hi > > > > We are experiencing the following > > > > - Hammer 0.94.2 > > - Ubuntu 14.04.1 > > - Kernel 3.16.0-37-generic > > - 40TB NTFS disk mounted through RBD > > > > > > First 50GB goes fine, but then

Re: [ceph-users] RBD mounted image on linux server kernel error and hangs the device

2015-07-05 Thread Ilya Dryomov
On Sun, Jul 5, 2015 at 6:58 PM, Tuomas Juntunen wrote: > Hi > > That's the only error that comes from this, there's nothing else. Is it repeated? Thanks, Ilya

Re: [ceph-users] RBD mounted image on linux server kernel error and hangs the device

2015-07-05 Thread Ilya Dryomov
On Sun, Jul 5, 2015 at 7:37 PM, Tuomas Juntunen wrote: > A couple of times the same, and the whole rbd mount is hung > > can't df or ls. > > umount -l and rbd unmap take 10-20 mins to get rid of it and then I can mount > again and try the transfer > > I have 20TB of stuff that needs to be cop

Re: [ceph-users] RBD mounted image on linux server kernel error and hangs the device

2015-07-05 Thread Ilya Dryomov
On Sun, Jul 5, 2015 at 7:57 PM, Tuomas Juntunen wrote: > Hi > > Is there any other kernel that would work? Anyone else had this kind of > problem with rbd map? Well, 4.1 is the latest and therefore the easiest to debug, assuming this is a kernel client problem. What is the output of /sys/kernel
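
The file being asked about is presumably the kernel client's debugfs state; a sketch of collecting it (requires debugfs mounted, and the directory under /sys/kernel/debug/ceph is named after the cluster fsid and client id):

  mount -t debugfs none /sys/kernel/debug 2>/dev/null   # usually already mounted
  cat /sys/kernel/debug/ceph/*/osdc   # in-flight OSD requests
  cat /sys/kernel/debug/ceph/*/monc   # monitor session state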

Re: [ceph-users] Cannot map rbd image with striping!

2015-07-09 Thread Ilya Dryomov
On Wed, Jul 8, 2015 at 11:02 PM, Hadi Montakhabi wrote: > Thank you! > Is striping supported while using CephFS? Yes. Thanks, Ilya

Re: [ceph-users] CephFS kernel client reboots on write

2015-07-13 Thread Ilya Dryomov
On Fri, Jul 10, 2015 at 9:36 PM, Jan Pekař wrote: > Hi all, > > I think I found a bug in the cephfs kernel client. > When I create a directory in cephfs and set the layout to > > ceph.dir.layout="stripe_unit=1073741824 stripe_count=1 > object_size=1073741824 pool=somepool" > > attempts to write a larger file

Re: [ceph-users] Cephfs and ERESTARTSYS on writes

2015-07-23 Thread Ilya Dryomov
On Thu, Jul 23, 2015 at 4:23 PM, Vedran Furač wrote: > On 07/23/2015 03:20 PM, Gregory Farnum wrote: >> On Thu, Jul 23, 2015 at 1:17 PM, Vedran Furač wrote: >>> Hello, >>> >>> I'm having an issue with nginx writing to cephfs. Often I'm getting: >>> >>> writev() "/home/ceph/temp/44/94/1/119444

Re: [ceph-users] Cephfs and ERESTARTSYS on writes

2015-07-23 Thread Ilya Dryomov
On Thu, Jul 23, 2015 at 5:37 PM, Vedran Furač wrote: > On 07/23/2015 04:19 PM, Ilya Dryomov wrote: >> On Thu, Jul 23, 2015 at 4:23 PM, Vedran Furač wrote: >>> On 07/23/2015 03:20 PM, Gregory Farnum wrote: >>>> >>>> That's...odd. Are you using th

Re: [ceph-users] Cephfs and ERESTARTSYS on writes

2015-07-23 Thread Ilya Dryomov
On Thu, Jul 23, 2015 at 6:02 PM, Vedran Furač wrote: > 4118 writev(377, [{"\5\356\307l\361"..., 4096}, {"\337\261\17<\257"..., > 4096}, {"\211&;s\310"..., 4096}, {"\370N\372:\252"..., 4096}, > {"\202\311/\347\260"..., 4096}, ...], 33) = ? ERESTARTSYS (To be restarted) > 4118 --- SIGALRM (Alarm c

Re: [ceph-users] Cephfs and ERESTARTSYS on writes

2015-07-23 Thread Ilya Dryomov
On Thu, Jul 23, 2015 at 6:28 PM, Vedran Furač wrote: > On 07/23/2015 05:25 PM, Ilya Dryomov wrote: >> On Thu, Jul 23, 2015 at 6:02 PM, Vedran Furač wrote: >>> 4118 writev(377, [{"\5\356\307l\361"..., 4096}, {"\337\261\17<\257"..., >>> 4096}, {&

Re: [ceph-users] Cephfs and ERESTARTSYS on writes

2015-07-24 Thread Ilya Dryomov
On Thu, Jul 23, 2015 at 9:34 PM, Vedran Furač wrote: > On 07/23/2015 06:47 PM, Ilya Dryomov wrote: >> >> To me this looks like a writev() interrupted by a SIGALRM. I think >> nginx guys read your original email the same way I did, which is "write >> syscall

Re: [ceph-users] Cephfs and ERESTARTSYS on writes

2015-07-24 Thread Ilya Dryomov
On Fri, Jul 24, 2015 at 3:54 PM, Vedran Furač wrote: > On 07/24/2015 09:54 AM, Ilya Dryomov wrote: >> >> I don't know - looks like nginx isn't setting SA_RESTART, so it should >> be repeating the write()/writev() itself. That said, if it happens >> onl

Re: [ceph-users] Cephfs and ERESTARTSYS on writes

2015-07-24 Thread Ilya Dryomov
On Fri, Jul 24, 2015 at 4:29 PM, Ilya Dryomov wrote: > On Fri, Jul 24, 2015 at 3:54 PM, Vedran Furač wrote: >> On 07/24/2015 09:54 AM, Ilya Dryomov wrote: >>> >>> I don't know - looks like nginx isn't setting SA_RESTART, so it should >>> be repeati

Re: [ceph-users] which kernel version can help avoid kernel client deadlock

2015-07-28 Thread Ilya Dryomov
On Tue, Jul 28, 2015 at 9:17 AM, van wrote: > Hi, list, > > I found in the ceph FAQ that the ceph kernel client should not run on > machines belonging to the ceph cluster. > As the ceph FAQ mentioned, “In older kernels, Ceph can deadlock if you try to > mount CephFS or RBD client services on the same host th

Re: [ceph-users] which kernel version can help avoid kernel client deadlock

2015-07-28 Thread Ilya Dryomov
On Tue, Jul 28, 2015 at 11:19 AM, van wrote: > Hi, Ilya, > > Thanks for your quick reply. > > Here is the link http://ceph.com/docs/cuttlefish/faq/ , under the "HOW > CAN I GIVE CEPH A TRY?” section which talks about the old kernel stuff. > > By the way, what’s the main reason for using kerne

Re: [ceph-users] which kernel version can help avoid kernel client deadlock

2015-07-28 Thread Ilya Dryomov
On Tue, Jul 28, 2015 at 2:46 PM, van wrote: > Hi, Ilya, > > In the dmesg, there are also a lot of libceph socket errors, which I think > may be caused by my stopping the ceph service without unmapping rbd. Well, sure enough, if you kill all OSDs, the filesystem mounted on top of an rbd device will get stuck

Re: [ceph-users] which kernel version can help avoid kernel client deadlock

2015-07-28 Thread Ilya Dryomov
On Tue, Jul 28, 2015 at 7:20 PM, van wrote: > >> On Jul 28, 2015, at 7:57 PM, Ilya Dryomov wrote: >> >> On Tue, Jul 28, 2015 at 2:46 PM, van wrote: >>> Hi, Ilya, >>> >>> In the dmesg, there is also a lot of libceph socket error, which I thi
