Re: [ceph-users] osd become unusable, blocked by xfsaild (?) and load > 5000

2015-12-08 Thread Jan Schermer
> On 08 Dec 2015, at 08:57, Benedikt Fraunhofer wrote: > > Hi Jan, >> Doesn't look near the limit currently (but I suppose you rebooted it in the meantime?). > the box these numbers came from has an uptime of 13 days, so it's one of the boxes that did survive yesterday's half-cluster-wi

Re: [ceph-users] osd become unusable, blocked by xfsaild (?) and load > 5000

2015-12-08 Thread Tom Christensen
We have been seeing this same behavior on a cluster that had been perfectly happy until we upgraded to the Ubuntu vivid 3.19 kernel. We are in the process of "upgrading" back to the 3.16 kernel across our cluster, as we've not seen this behavior on that kernel for over 6 months and we're pretty str

Re: [ceph-users] osd become unusable, blocked by xfsaild (?) and load > 5000

2015-12-08 Thread Benedikt Fraunhofer
Hi Tom, > We have been seeing this same behavior on a cluster that has been perfectly > happy until we upgraded to the ubuntu vivid 3.19 kernel. We are in the I can't recall when we gave 3.19 a shot, but now that you say it... The cluster was happy for >9 months with 3.16. Did you try 4.2 or do y

Re: [ceph-users] osd become unusable, blocked by xfsaild (?) and load > 5000

2015-12-08 Thread Tom Christensen
We didn't go forward to 4.2 as it's a large production cluster and we just needed the problem fixed. We'll probably test out 4.2 in the next couple of months, but this one slipped past us, as it didn't occur in our test cluster until after we had upgraded production. In our experience it takes about

Re: [ceph-users] osd become unusable, blocked by xfsaild (?) and load > 5000

2015-12-08 Thread Benedikt Fraunhofer
Hi Tom, 2015-12-08 10:34 GMT+01:00 Tom Christensen : > We didn't go forward to 4.2 as its a large production cluster, and we just > needed the problem fixed. We'll probably test out 4.2 in the next couple Unfortunately we don't have the luxury of a test cluster. And to add to that, we couldn't s

Re: [ceph-users] osd become unusable, blocked by xfsaild (?) and load > 5000

2015-12-08 Thread Mykola Dvornik
The same thing happens to my setup with CentOS 7.x + a non-stock kernel (kernel-ml from elrepo). I was not happy with the IOPS I got out of the stock CentOS 7.x kernel, so I did the kernel upgrade, and crashes started to happen until some of the OSDs became unbootable altogether. The funny thing is that I was no

[ceph-users] http://gitbuilder.ceph.com/

2015-12-08 Thread Xav Paice
Hi, Just wondering if there's a known issue with http://gitbuilder.ceph.com/ - if I go to several URLs, e.g. http://gitbuilder.ceph.com/libapache-mod-fastcgi-deb-trusty-x86_64-basic, I get a 403. That's still the right place to get debs, right?

[ceph-users] OSD error

2015-12-08 Thread Dan Nica
Hi guys, Recently I installed a ceph cluster, version 9.2.0, and in my OSD logs I see these errors: 2015-12-08 04:49:12.931683 7f42ec266700 -1 lsb_release_parse - pclose failed: (13) Permission denied 2015-12-08 04:49:12.955264 7f42ec266700 -1 lsb_release_parse - pclose failed: (13) Permission de
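The OSD gathers host metadata at startup by shelling out to lsb_release, so one quick check is whether that binary runs as the user the daemon runs as (a diagnostic sketch, not a confirmed fix; Infernalis daemons run as the ceph user by default):

    # Does lsb_release work when run as the ceph user?
    sudo -u ceph lsb_release -a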

Re: [ceph-users] Kernel RBD hang on OSD Failure

2015-12-08 Thread Tom Christensen
We aren't running NFS, but regularly use the kernel driver to map RBDs and mount filesystems on them. We see very similar behavior across nearly all kernel versions we've tried. In my experience only very few versions of the kernel driver survive any sort of CRUSH map change/update while somethin

Re: [ceph-users] osd become unusable, blocked by xfsaild (?) and load > 5000

2015-12-08 Thread Tom Christensen
We run deep scrubs via cron with a script so we know when deep scrubs are happening, and we've seen nodes fail both during deep scrubbing and while no deep scrubs are occurring, so I'm pretty sure it's not related. On Tue, Dec 8, 2015 at 2:42 AM, Benedikt Fraunhofer wrote: > Hi Tom, > > 2015-12-0
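For reference, a minimal sketch of what such a cron-driven script can look like (an illustration, not the poster's actual script; the OSD loop and the pause are assumptions):

    #!/bin/sh
    # Deep-scrub every OSD in turn, pausing between them so the
    # scrubs don't all pile up at once.
    for osd in $(ceph osd ls); do
        ceph osd deep-scrub "$osd"
        sleep 300
    done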

Re: [ceph-users] Kernel RBD hang on OSD Failure

2015-12-08 Thread Ilya Dryomov
On Tue, Dec 8, 2015 at 10:57 AM, Tom Christensen wrote: > We aren't running NFS, but regularly use the kernel driver to map RBDs and > mount filesystems in same. We see very similar behavior across nearly all > kernel versions we've tried. In my experience only very few versions of the > kernel

Re: [ceph-users] Kernel RBD hang on OSD Failure

2015-12-08 Thread Tom Christensen
We haven't submitted a ticket, as we've just avoided using the kernel client. We've periodically tried with various kernels and various versions of ceph over the last two years, but have just given up each time and reverted to using rbd-fuse, which, although not super stable, at least doesn't hang t

Re: [ceph-users] Kernel RBD hang on OSD Failure

2015-12-08 Thread Tom Christensen
To be clear, we are also using format 2 RBDs, so we didn't really expect it to work until recently, as it was listed as unsupported. Our understanding is that as of 3.19, RBD format 2 should be supported. Are we incorrect in that understanding? On Tue, Dec 8, 2015 at 3:44 AM, Tom Christe
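A quick way to check what a given image actually uses before mapping it (a sketch; pool and image names are placeholders):

    # 'format: 2' and the enabled feature list appear in the output.
    rbd info rbd/myimage
    # Map only once the features are known to be supported by the running kernel.
    rbd map rbd/myimage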

Re: [ceph-users] Kernel RBD hang on OSD Failure

2015-12-08 Thread Ilya Dryomov
On Tue, Dec 8, 2015 at 11:53 AM, Tom Christensen wrote: > To be clear, we are also using format 2 RBDs, so we didn't really expect it > to work until recently as it was listed as unsupported. We are under the > understanding that as of 3.19 RBD format 2 should be supported. Are we > incorrect in

[ceph-users] ceph new installation of ceph 9.2.0 issue and crashing OSDs

2015-12-08 Thread Kenneth Waegeman
Hi, I installed ceph 9.2.0 on a new cluster of 3 nodes, with 50 OSDs on each node (300GB disks, 96GB RAM). While installing, I hit an issue where I could not even log in as the ceph user, so I increased some limits in security/limits.conf: ceph - nproc 1048576 ceph
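The limits snippet above is cut off in the preview; entries of that kind in /etc/security/limits.conf usually take this shape (the nofile line is an assumption, since the quoted text ends after the nproc entry):

    # Raise per-user limits for the ceph user.
    ceph    -    nproc     1048576
    ceph    -    nofile    1048576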

[ceph-users] CephFS Path restriction

2015-12-08 Thread Dennis Kramer (DT)
Hi, I'm trying to restrict clients to mounting a specific path in CephFS. I've been using the official doc for this: http://docs.ceph.com/docs/master/cephfs/client-auth/ After setting these cap restrictions, the client can still mount and use all dire

Re: [ceph-users] Infernalis for Debian 8 armhf

2015-12-08 Thread Daleep Singh Bais
Hi, I tried following the steps you had mentioned and I am stuck while building the package using dpkg-buildpackage -j4, with the below error message: Submodule path 'src/rocksdb': checked out 'dcdb0dd29232ece43f093c99220b0eea7ead51ff' Unable to checkout 'b0d1137d31e4b36b72ccae9c0a9
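Since the failure above is a submodule checkout error, one generic step worth trying before rebuilding (a sketch, not a confirmed fix for this exact message) is to re-sync and re-fetch the submodules:

    git submodule sync
    git submodule update --init --recursive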

Re: [ceph-users] ceph-disk list crashes in infernalis

2015-12-08 Thread Loic Dachary
Hi Felix, Could you please ls -l /dev/cciss /sys/block/cciss*/ ? Thanks for being the cciss proxy in fixing this problem :-) Cheers On 07/12/2015 11:43, Loic Dachary wrote: > Thanks ! > > On 06/12/2015 17:50, Stolte, Felix wrote: >> Hi Loic, >> >> output is: >> >> /dev: >> insgesamt 0 >> crw--

Re: [ceph-users] CephFS Path restriction

2015-12-08 Thread John Spray
On Tue, Dec 8, 2015 at 1:43 PM, Dennis Kramer (DT) wrote: > Hi, > > I'm trying to restrict clients to mount a specific path in CephFS. > I've been using the official doc for this: > http://docs.ceph.com/docs/master/cephfs/client-auth/ > > Aft
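For readers following along, the path-restricted MDS cap from the master docs takes this general shape (a sketch; client name, pool, and path are placeholders, and whether the restriction is actually enforced depends on the version in use, which is the subject of this thread):

    ceph auth get-or-create client.restricted \
        mon 'allow r' \
        osd 'allow rw pool=cephfs_data' \
        mds 'allow rw path=/restricted'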

Re: [ceph-users] ceph-disk list crashes in infernalis

2015-12-08 Thread Stolte, Felix
Hi Loic, glad to help. Thanks for fixing this problem :-) Output is: /dev/cciss: insgesamt 0 brw-rw 1 root disk 104, 0 Dez 2 17:02 c0d0 brw-rw 1 root disk 104, 1 Dez 2 17:02 c0d0p1 brw-rw 1 root disk 104, 2 Dez 2 17:02 c0d0p2 brw-rw 1 root disk 104, 5 Dez 2 17:02 c0d0p

Re: [ceph-users] ceph-disk list crashes in infernalis

2015-12-08 Thread Loic Dachary
I also need to confirm that the names that show up in /sys/block/*/holders contain a ! (it would not make sense to me if they didn't, but ...). On 08/12/2015 15:05, Loic Dachary wrote: > Hi Felix, > > Could you please ls -l /dev/cciss /sys/block/cciss*/ ? > > Thanks for being the cciss proxy in f

Re: [ceph-users] CephFS Path restriction

2015-12-08 Thread Dennis Kramer (DT)
Ah, that explains a lot. Thank you. Yes, it was a bit confusing which version it applied to. Awesome addition by the way, I like the path parameter! Cheers. On 12/08/2015 03:15 PM, John Spray wrote: > On Tue, Dec 8, 2015 at 1:43 PM, Dennis Kramer

Re: [ceph-users] ceph-disk list crashes in infernalis

2015-12-08 Thread Stolte, Felix
Yes, they do contain a "!"
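For context: block devices whose kernel names contain a '/', like cciss/c0d0, show up in sysfs with the '/' replaced by '!', which is why the holders paths look unusual. For example:

    ls -l '/sys/block/cciss!c0d0/holders'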

Re: [ceph-users] http://gitbuilder.ceph.com/

2015-12-08 Thread Ken Dreyer
Yes, we've had to move all of our hardware out of the datacenter in Irvine, California to a new home in Raleigh, North Carolina. The backend server for gitbuilder.ceph.com had a *lot* of data and we were not able to sync all of it to an interim server in Raleigh before we had to unplug the old one.

Re: [ceph-users] osd become unusable, blocked by xfsaild (?) and load > 5000

2015-12-08 Thread Scottix
I can confirm it seems to be kernels greater than 3.16; we had this problem where servers would lock up and we had to perform restarts on a weekly basis. We downgraded to 3.16, and since then we have not had to do any restarts. I did find this thread in the XFS forums, and I am not sure if it has been fixed

[ceph-users] ceph snapshot

2015-12-08 Thread Dan Nica
Hi guys, So per the documentation I must stop I/O before taking RBD snapshots; how do I do that, and what does that mean? Do I have to unmount the RBD image? -- Dan

[ceph-users] Fwd: scrub error with ceph

2015-12-08 Thread Erming Pei
(Found no response from the current list, so forwarded to ceph-us...@ceph.com.) Sorry if it's duplicated. Original Message Subject: scrub error with ceph Date: Mon, 7 Dec 2015 14:15:07 -0700 From: Erming Pei To: ceph-users@lists.ceph.com Hi, I found th

Re: [ceph-users] ceph new installation of ceph 9.2.0 issue and crashing OSDs

2015-12-08 Thread Brad Hubbard
Looks like it's failing to create a thread. Try setting kernel.pid_max to 4194303 in /etc/sysctl.conf Cheers, Brad - Original Message - > From: "Kenneth Waegeman" > To: ceph-users@lists.ceph.com > Sent: Tuesday, 8 December, 2015 10:45:11 PM > Subject: [ceph-users] ceph new installation
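For reference, here is how the suggested change looks on disk and applied live (the value is the one from the reply above; a sketch, assuming sysctl.conf is the preferred mechanism on the host):

    # /etc/sysctl.conf
    kernel.pid_max = 4194303

    # Apply without a reboot:
    sysctl -p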

Re: [ceph-users] Ceph extras package support for centos kvm-qemu

2015-12-08 Thread Ken Dreyer
When we re-arranged the download structure for packages and moved everything from ceph.com to download.ceph.com, we did not carry ceph-extras over. The reason is that the packages there were unmaintained. The EL6 QEMU binaries were vulnerable to VENOM (CVE-2015-3456) and maybe other CVEs, and no u

Re: [ceph-users] OSD error

2015-12-08 Thread Brad Hubbard
+ceph-devel - Original Message - > From: "Dan Nica" > To: ceph-us...@ceph.com > Sent: Tuesday, 8 December, 2015 7:54:20 PM > Subject: [ceph-users] OSD error > Hi guys, > Recently I installed ceph cluster version 9.2.0, and on my osd logs I see > these errors: > 2015-12-08 04:49:12.93

[ceph-users] Ceph 9.2 fails to install in COS 7.1.1503: Report and Fix

2015-12-08 Thread Goncalo Borges
Hi Cephers, This is just to report an issue (and a workaround) regarding dependencies in CentOS 7.1.1503. Last week, I installed a couple of nodes and there were no issues with dependencies. This week, the installation of the ceph RPM fails because it depends on gperftools-libs which, on its own,
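The preview cuts off before the workaround, but the diagnosis on CentOS 7 generally starts with finding out which repo and version actually provide the dependency (a sketch only, not the reported fix):

    # Which repo/version provides the library ceph wants?
    yum provides gperftools-libs
    # What exactly does the ceph package require?
    yum deplist ceph | grep gperftools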

Re: [ceph-users] Cannot create Initial Monitor

2015-12-08 Thread Aakanksha Pudipeddi-SSI
I am still unable to get past this issue. Could anyone help me out here? Thanks, Aakanksha From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Aakanksha Pudipeddi-SSI Sent: Thursday, December 03, 2015 8:08 PM To: ceph-users Subject: [ceph-users] Cannot create Initial Monitor

Re: [ceph-users] ceph snapshot

2015-12-08 Thread Yan, Zheng
On Wed, Dec 9, 2015 at 12:10 AM, Dan Nica wrote: > Hi guys, > > So from documentation I must stop the I/O before taking rbd snapshots, how > do I do that or what does that mean ? do I have to unmount the rbd image ? see fsfreeze(8) command > -- > Dan
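A minimal sketch of the freeze/snapshot/thaw sequence fsfreeze(8) enables (mount point, pool, image, and snapshot names are placeholders):

    fsfreeze --freeze /mnt/rbd           # flush and block new writes
    rbd snap create rbd/myimage@mysnap   # snapshot while the fs is quiescent
    fsfreeze --unfreeze /mnt/rbd         # resume I/O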

Re: [ceph-users] Cannot create Initial Monitor

2015-12-08 Thread Varada Kari
Could you try starting the monitor manually and see what the error is? Like ceph-mon -i --cluster ceph &. Enable more logging (debug_mon). Varada From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Aakanksha Pudipeddi-SSI Sent: Wednesday, December 09, 2015 7:47 AM To: Aaka
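Spelled out with a placeholder mon id, since the id argument is missing from the preview above (the debug level here is an assumption; -d keeps the daemon in the foreground logging to stderr):

    ceph-mon -i mon-a --cluster ceph -d --debug-mon 20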

Re: [ceph-users] rbd merge-diff error

2015-12-08 Thread Alex Gorbachev
Hi Josh, On Mon, Dec 7, 2015 at 6:50 PM, Josh Durgin wrote: > On 12/07/2015 03:29 PM, Alex Gorbachev wrote: > >> When trying to merge two results of rbd export-diff, the following error >> occurs: >> >> iss@lab2-b1:~$ rbd export-diff --from-snap autosnap120720151500 >> spin1/scrun1@autosnap12072
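For reference, the general shape of the export-diff/merge-diff sequence being attempted (a sketch; the snapshot names here are shortened placeholders for the autosnap names above):

    rbd export-diff --from-snap snap1 spin1/scrun1@snap2 diff-1-2
    rbd export-diff --from-snap snap2 spin1/scrun1@snap3 diff-2-3
    rbd merge-diff diff-1-2 diff-2-3 diff-1-3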

Re: [ceph-users] rbd merge-diff error

2015-12-08 Thread Josh Durgin
On 12/08/2015 10:44 PM, Alex Gorbachev wrote: Hi Josh, On Mon, Dec 7, 2015 at 6:50 PM, Josh Durgin wrote: On 12/07/2015 03:29 PM, Alex Gorbachev wrote: When trying to merge two results of rbd export-diff, the following error occurs:

Re: [ceph-users] ceph snapshot

2015-12-08 Thread Jan Schermer
You don't really *have* to stop I/O. In fact, I recommend you don't unless you have to. The reason this is recommended is to minimize the risk of data loss, because the snapshot will be in a very similar state as if you suddenly lost power to the server. Obviously, if you need to have the sam