Re: [ceph-users] Ceph Recovery

2016-05-17 Thread Gaurav Bafna
Is it a known issue, and is it expected? When an OSD is marked out, the reweight becomes 0 and the PGs should get remapped, right? I do see recovery after removing it from the crush map. Thanks, Gaurav. On Wed, May 18, 2016 at 12:08 PM, Lazuardi Nasution wrote: > Hi Gaurav, > > Not only marked out,

Re: [ceph-users] Ceph Recovery

2016-05-17 Thread Lazuardi Nasution
Hi Gaurav, Not only marked out; you need to remove it from the crush map to make sure the cluster does auto recovery. It seems that the marked-out OSD still appears in the crush map calculation, so it must be removed manually. You will see a recovery process start after you remove the OSD from the crush map.
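For reference, a minimal sketch of the removal sequence being described (osd.12 is a placeholder id, run from an admin node):
  # marking the OSD out only sets its reweight to 0
  ceph osd out 12
  # removing it from the CRUSH map is what reliably triggers recovery
  ceph osd crush remove osd.12
  # watch recovery progress
  ceph -w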

[ceph-users] dense storage nodes

2016-05-17 Thread Blair Bethwaite
Hi all, What are the densest node configs out there, and what are your experiences with them and tuning required to make them work? If we can gather enough info here then I'll volunteer to propose some upstream docs covering this. At Monash we currently have some 32-OSD nodes (running RHEL7), tho
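As a starting point for such docs, a hedged sketch of host-level knobs that commonly need raising on many-OSD nodes; the values are illustrative assumptions, not measured recommendations:
  # /etc/sysctl.d/90-ceph-dense.conf
  kernel.pid_max = 4194303       # each ceph-osd runs hundreds of threads
  kernel.threads-max = 2097152
  fs.aio-max-nr = 1048576        # many journals doing AIO on one host
  fs.file-max = 26234859

  # ceph.conf, [global] or [osd] section
  max open files = 131072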

[ceph-users] lsof ceph-osd find many "can't identify protocol"

2016-05-17 Thread Dong Wu
Hi, cephers. I ran lsof on my system and found a lot of "can't identify protocol" entries; does it mean socket descriptors are leaking? ceph-osd 5389 root 112u sock 0,7 0t0 295880018 can't identify protocol ceph-osd 5389 root 136u sock 0
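A quick way to see whether those entries keep growing per process (plain lsof/pgrep, nothing Ceph-specific; sample it twice a few minutes apart and compare):
  for pid in $(pgrep ceph-osd); do
      printf 'ceph-osd pid %s: ' "$pid"
      lsof -p "$pid" 2>/dev/null | grep -c "can't identify protocol"
  done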

Re: [ceph-users] PG stuck incomplete after power failure.

2016-05-17 Thread Hein-Pieter van Braam
Hi, Thank you so much! This fixed my issue completely, minus one image that was apparently being uploaded while the rack lost power. Is there anything I can do to prevent this from happening in the future, or a way to detect this issue? I've looked online for an explanation of exactly what this

Re: [ceph-users] PG stuck incomplete after power failure.

2016-05-17 Thread Samuel Just
Try restarting the primary osd for that pg with osd_find_best_info_ignore_history_les set to true (don't leave it set long term). -Sam On Tue, May 17, 2016 at 7:50 AM, Hein-Pieter van Braam wrote: > Hello, > > Today we had a power failure in a rack housing our OSD servers. We had > 7 of our 30 to
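A hedged sketch of that procedure (PG id 2.3f and osd.12 are placeholders; systemd restart shown as an assumption, substitute your platform's init):
  # find the acting primary for the stuck PG
  ceph pg map 2.3f

  # on the primary's host, scope the flag to that one OSD in ceph.conf
  [osd.12]
  osd find best info ignore history les = true

  # restart the daemon, wait for the PG to peer, then remove the option
  # again and restart once more
  systemctl restart ceph-osd@12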

Re: [ceph-users] OSD process doesn't die immediately after device disappears

2016-05-17 Thread Somnath Roy
Hi Marcel, FileStore doesn't subscribe to any such event from the device. Presently, it relies on the filesystem (for the FileStore assert) to return an error during IO, and based on that error it asserts. The FileJournal assert you are getting in the aio path relies on the Linux aio sy

[ceph-users] ceph-deploy prepare doesnt mount osd

2016-05-17 Thread Stefan Eriksson
Hi, I'm running hammer 0.94.7 (CentOS 7) and have issues with deploying new OSDs: they don't mount after initialization. I run this command: ceph-deploy osd prepare ceph01-osd02:sdj:/journals/osd.38 Everything seems fine, but I get this in the OSD log: 2016-05-17 11:40:41.298846 7fe4c619b880
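Note that ceph-deploy osd prepare only partitions and formats the device; the mount happens at activation, normally triggered by udev. A sketch of doing that step by hand, assuming the data partition came up as /dev/sdj1:
  # on the OSD host: see how ceph-disk classified the partitions
  ceph-disk list
  # activate by hand
  ceph-disk activate /dev/sdj1
  # or from the admin node
  ceph-deploy osd activate ceph01-osd02:/dev/sdj1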

Re: [ceph-users] v10.2.1 Jewel released

2016-05-17 Thread Ken Dreyer
On Mon, May 16, 2016 at 11:14 PM, Karsten Heymann wrote: > the updated debian packages are *still* missing ceph-{mon,osd}.target. > Was it intentional to release the point release without the fix? It was not intentional. Teuthology does not test systemd, so these sorts of things tend to fall off
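Until the packages carry the units, one workaround (an assumption, not from the thread) is to drop in a minimal stand-in for the missing target so the per-daemon units have something to hang off:
  # /etc/systemd/system/ceph-mon.target (ceph-osd.target is analogous)
  [Unit]
  Description=ceph target allowing to start/stop all ceph-mon@.service instances at once
  PartOf=ceph.target

  [Install]
  WantedBy=multi-user.target ceph.target

  # then reload systemd and enable the target plus the local mon instance
  systemctl daemon-reload
  systemctl enable ceph-mon.target ceph-mon@$(hostname -s)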

[ceph-users] PG stuck incomplete after power failure.

2016-05-17 Thread Hein-Pieter van Braam
Hello, Today we had a power failure in a rack housing our OSD servers. We had 7 of our 30 total OSD nodes down. For the affected PG, 2 out of the 3 OSDs went down. After everything was back and mostly healthy I found one placement group marked as incomplete. I can't figure out why. I'm running ceph
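The usual first diagnostics for an incomplete PG, sketched with a placeholder PG id:
  ceph health detail | grep incomplete
  ceph pg dump_stuck inactive
  # the recovery_state section shows what peering is blocked on
  ceph pg 2.3f query > pg-2.3f-query.json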

[ceph-users] OSD process doesn't die immediately after device disappears

2016-05-17 Thread Marcel Lauhoff
Hi, we recently played the good ol' pull-a-harddrive game and wondered why the OSD processes took a couple of minutes to recognize their misfortune. In our configuration, two OSDs share an HDD: OSD n uses it as its journal device, OSD n+1 as its filesystem. We expected that OSDs detect this kind of f

[ceph-users] How does EC pools support thousands of xattrs (XFS) but no omaps?

2016-05-17 Thread Chandan Kumar Singh
Hi, While migrating to EC pools, I learned that they do not support omaps but do allow thousands of xattrs (XFS). Are these xattrs stored in a key-value store or in the XFS file system? Regards, Chandan

[ceph-users] Jewel CephFS quota (setfattr, getfattr)

2016-05-17 Thread Edgaras Lukoševičius
Hello, I have ceph 10.2 (Jewel) running with CephFS on CentOS 7.2, which is mounted using ceph-fuse 10.2. The attributes ceph.quota.max_files and ceph.quota.max_bytes don't work. # setfattr -n ceph.quota.max_files -v 10 /home/quotatest1 # setfattr -n ceph.quota.max_bytes -v 100 /home/quotates
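Two things worth checking (the second is an assumption on my part, not something confirmed in the thread): that the attributes were actually stored, and that quota enforcement is enabled on the ceph-fuse client, since only fuse/libcephfs clients enforce quotas in Jewel.
  # verify the attributes were stored
  getfattr -n ceph.quota.max_files /home/quotatest1
  getfattr -n ceph.quota.max_bytes /home/quotatest1

  # assumption: enable client-side quota enforcement in the client's
  # ceph.conf, then remount with ceph-fuse
  [client]
  client quota = true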

Re: [ceph-users] failing to respond to cache pressure

2016-05-17 Thread Brett Niver
Hi Oliver, Our corresponding RHCS downstream release of CephFS will be labeled "Tech Preview", which means it's unsupported, but we believe it's stable enough for experimentation. When we do release CephFS as "production ready", that means we've done even more exhaustive testing and that this i

Re: [ceph-users] failing to respond to cache pressure

2016-05-17 Thread Mark Nelson
On 05/17/2016 01:27 AM, Andrus, Brian Contractor wrote: Yes, I use the fuse client because the kernel client isn't happy with selinux settings. I have experienced the same symptoms with both clients, however. Yes, the clients that had nothing were merely mounted and nothing, not even an 'ls'

Re: [ceph-users] v0.94.7 Hammer released

2016-05-17 Thread Christian Balzer
Hello, for the record, I did the exact same sequence (no MDS) on my test cluster with exactly the same results. Didn't report it as I assumed it to be merely a noisy (but harmless) upgrade artifact. Christian On Tue, 17 May 2016 14:07:21 +0200 Dan van der Ster wrote: > On Tue, May 17, 2016 at

Re: [ceph-users] v0.94.7 Hammer released

2016-05-17 Thread Dan van der Ster
On Tue, May 17, 2016 at 1:56 PM, Sage Weil wrote: > On Tue, 17 May 2016, Dan van der Ster wrote: >> Hi Sage et al, >> >> I'm updating our pre-prod cluster from 0.94.6 to 0.94.7 and after >> upgrading the ceph-mon's I'm getting loads of warnings like: >> >> 2016-05-17 10:01:29.314785 osd.76 [WRN] f

Re: [ceph-users] v0.94.7 Hammer released

2016-05-17 Thread Sage Weil
On Tue, 17 May 2016, Dan van der Ster wrote: > Hi Sage et al, > > I'm updating our pre-prod cluster from 0.94.6 to 0.94.7 and after > upgrading the ceph-mon's I'm getting loads of warnings like: > > 2016-05-17 10:01:29.314785 osd.76 [WRN] failed to encode map e103116 > with expected crc > > I've
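A quick way to confirm that the mixed-version state producing these warnings has cleared, i.e. that every daemon is on 0.94.7:
  # OSD versions, from any admin node
  ceph tell osd.* version
  # monitor version, on each mon host via the admin socket
  ceph daemon mon.$(hostname -s) version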

Re: [ceph-users] failing to respond to cache pressure

2016-05-17 Thread Oliver Dzombic
Hi Brett, aside from the question of whether what Brian experienced has anything to do with code stability: since it is new to me that there is a difference between "stable" and "production ready", I would be happy if you could tell me what the table looks like. One of the team was joking something lik

Re: [ceph-users] ceph-mon.target not enabled

2016-05-17 Thread Ruben Kerkhof
Hi Tim, On Mon, May 16, 2016 at 12:40 PM, Tim Serong wrote: > Enablement of the various ceph targets should happen automatically at > RPM install since https://github.com/ceph/ceph/commit/53b1a67, but that > landed on approximately the same day as your email, so I guess wasn't in > the packages y
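On hosts installed from packages that predate that commit, the targets can be enabled by hand; a minimal sketch:
  systemctl enable ceph.target ceph-mon.target
  systemctl start ceph-mon.target
  # check which units the target now pulls in
  systemctl list-dependencies ceph-mon.target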

Re: [ceph-users] failing to respond to cache pressure

2016-05-17 Thread John Spray
On Tue, May 17, 2016 at 7:27 AM, Andrus, Brian Contractor wrote: > Yes, I use the fuse client because the kernel client isn't happy with selinux > settings. > I have experienced the same symptoms with both clients, however. Including giving the same low-ish performance? > Yes, the clients that

[ceph-users] CephFS Jewel not using cache tiering much

2016-05-17 Thread Daniel van Ham Colchete
Hello everyone! I'm putting CephFS into production here to host Dovecot mailboxes. That's a big use case in the Dovecot community. Versions: Ubuntu 14.04 LTS with kernel 4.4.0-22-generic, Ceph 10.2.1-1trusty; CephFS uses the kernel client. Right now I'm migrating my users to this new system. That s
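For a writeback cache tier that seems under-used, these are the pool settings usually checked first; the pool name and values are placeholders, not recommendations:
  ceph osd pool get cephfs-cache hit_set_count
  ceph osd pool get cephfs-cache target_max_bytes
  # Jewel only promotes objects seen in recent hit sets; lowering the
  # recency thresholds makes promotion more aggressive
  ceph osd pool set cephfs-cache min_read_recency_for_promote 1
  ceph osd pool set cephfs-cache min_write_recency_for_promote 1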

Re: [ceph-users] v0.94.7 Hammer released

2016-05-17 Thread koukou73gr
Same here. Warnings appeared for OSDs running the .6 version each time one of the rest was restarted to the .7 version. When the last .6 OSD host was upgraded, there were no more warnings from the rest. Cluster seems happy :) -K. On 05/17/2016 11:04 AM, Dan van der Ster wrote: > Hi Sage et a

Re: [ceph-users] v0.94.7 Hammer released

2016-05-17 Thread Max A. Krasilnikov
Hello! On Tue, May 17, 2016 at 10:04:41AM +0200, dan wrote: > Hi Sage et al, > I'm updating our pre-prod cluster from 0.94.6 to 0.94.7 and after > upgrading the ceph-mon's I'm getting loads of warnings like: > 2016-05-17 10:01:29.314785 osd.76 [WRN] failed to encode map e103116 > with expected

Re: [ceph-users] v0.94.7 Hammer released

2016-05-17 Thread Dan van der Ster
Hi Sage et al, I'm updating our pre-prod cluster from 0.94.6 to 0.94.7 and after upgrading the ceph-mon's I'm getting loads of warnings like: 2016-05-17 10:01:29.314785 osd.76 [WRN] failed to encode map e103116 with expected crc I've seen that error is whitelisted in the qa-suite: https://github