Re: [ceph-users] Automatically timing out/removing dead hosts?

2015-01-20 Thread Gregory Farnum
On Tue, Jan 20, 2015 at 1:32 AM, Christopher Armstrong wrote: > Hi folks, > > We have many users who run Deis on AWS, and our default configuration places > hosts in an autoscaling group. Ceph runs on all hosts in the cluster > (monitors and OSDs), and users have reported losing quorum after havin
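
For reference, cleaning up a host that is never coming back generally means removing its daemons from the maps by hand; a minimal sketch, assuming a placeholder OSD id (osd.12) and monitor name (mon-a) that are not from this thread:

    # remove a dead OSD from the CRUSH map, auth database and OSD map
    ceph osd out osd.12
    ceph osd crush remove osd.12
    ceph auth del osd.12
    ceph osd rm osd.12

    # remove a monitor that will never return, so it stops counting toward quorum
    ceph mon remove mon-a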

[ceph-users] CEPHFS with Erasure Coded Pool for Data and Replicated Pool for Meta Data

2015-01-20 Thread Mohamed Pakkeer
Hi all, We are trying to create a 2 PB-scale Ceph storage cluster for file system access using erasure-coded profiles in the Giant release. Can we create an erasure-coded pool (k+m = 10+3) for data and a replicated (4 replicas) pool for metadata when creating CephFS? What are the pros and cons of using two d
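
A sketch of how such a layout could be created on Giant; the pool names, profile name and PG counts below are placeholders, not anything from the thread:

    # 10+3 erasure-code profile and an EC data pool
    ceph osd erasure-code-profile set ec-10-3 k=10 m=3
    ceph osd pool create ecdata 4096 4096 erasure ec-10-3

    # in this era an EC pool needed a replicated cache tier in front to back CephFS,
    # since EC pools did not support the required overwrite operations
    ceph osd pool create ecdata-cache 512 512 replicated
    ceph osd tier add ecdata ecdata-cache
    ceph osd tier cache-mode ecdata-cache writeback
    ceph osd tier set-overlay ecdata ecdata-cache

    # replicated metadata pool with 4 copies, then the filesystem itself
    ceph osd pool create cephfs_metadata 512 512 replicated
    ceph osd pool set cephfs_metadata size 4
    ceph fs new cephfs cephfs_metadata ecdata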

Re: [ceph-users] New firefly tiny cluster stuck unclean

2015-01-20 Thread Eneko Lacunza
Hi all, Finally, this was fixed as follows: # ceph osd pool set rbd size 1 (wait some seconds for HEALTH_OK) # ceph osd pool set rbd size 2 (wait almost an hour for HEALTH_OK after backfilling) I wanted to avoid this but didn't want to leave the cluster in a bad state all night :) I really think t
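
For anyone following along, the sequence from the message laid out step by step; the pool name rbd is from the thread, and the status commands are just the usual way to watch it settle:

    ceph osd pool set rbd size 1
    ceph -s        # wait for HEALTH_OK
    ceph osd pool set rbd size 2
    ceph -w        # watch backfill until HEALTH_OK again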

Re: [ceph-users] Automatically timing out/removing dead hosts?

2015-01-20 Thread Christopher Armstrong
> Can't you hook into some system that tells you when nodes are gone and use that to do this, instead of waiting for timeouts? I wish we could! The AWS autoscaler will attempt to shut down instances gracefully, but not infrequently they are shut down forcefully. And there's no way I can find to te

Re: [ceph-users] Behaviour of Ceph while OSDs are down

2015-01-20 Thread Gregory Farnum
On Tue, Jan 20, 2015 at 2:40 AM, Christian Eichelmann wrote: > Hi all, > > I want to understand what Ceph does if several OSDs are down. First of all, > some words about our setup: > > We have 5 monitors and 12 OSD servers, each has 60x2TB disks. These servers > are spread across 4 racks in our datace
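
A few commands that are handy when reasoning about (or riding out) a multi-OSD outage; a generic sketch, with rbd used only as a placeholder pool name:

    # see which OSDs are down and where they sit in the CRUSH tree
    ceph osd tree

    # how many replicas a pool keeps, and how many it needs to stay writeable
    ceph osd pool get rbd size
    ceph osd pool get rbd min_size

    # during planned maintenance, stop down OSDs from being marked out
    ceph osd set noout
    ceph osd unset noout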

[ceph-users] Ceph-btrfs layout

2015-01-20 Thread James
Hello. I'm new to Ceph, so suggestions, guidance, and advice as to what works and what fails are most appreciated. Hardware: (3) FX8350, 32G RAM, (2) 2T drives each, running Gentoo. Each system has btrfs RAID 1 set up like so: Disklabel type: gpt Device Start End Size Typ
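
If a starting point helps, an OSD can be placed on an existing btrfs filesystem by pointing it at a directory; a minimal sketch, assuming ceph-deploy and a placeholder host name and path:

    # the directory must already exist on the host, on the btrfs mount
    ceph-deploy osd prepare node1:/var/lib/ceph/osd/osd0
    ceph-deploy osd activate node1:/var/lib/ceph/osd/osd0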

[ceph-users] rbd to rbd file copy using 100% cpu

2015-01-20 Thread Shain Miley
Hello, Long story short...last night I did something similar to what Edwin did here: http://permalink.gmane.org/gmane.comp.file-systems.ceph.user/16314 To begin trying to fix my mistake, I created another rbd image (in the same pool) and mounted it on the server as well: root@cephmou
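
For context, creating, mapping and mounting a second rbd image looks roughly like this; the pool/image names, size and mount point are placeholders:

    rbd create rbd/recovery --size 102400    # size is in MB on this release
    rbd map rbd/recovery
    mkfs.xfs /dev/rbd/rbd/recovery
    mkdir -p /mnt/recovery
    mount /dev/rbd/rbd/recovery /mnt/recovery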

[ceph-users] New firefly tiny cluster stuck unclean

2015-01-20 Thread Eneko Lacunza
Hi all, I've just created a new Ceph cluster for RBD with the latest Firefly: - 3 monitors - 2 OSD nodes, each with 1 S3700 (journals) + 2 x 3TB WD Red (OSDs) The network is 1 Gbit, with separate physical interfaces for the public and private networks. There's only one pool, "rbd", size=2. There are just 5 rbd dev
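
The usual first step for a cluster stuck unclean is to look at which PGs are stuck and why; a sketch, with 0.1 as a placeholder PG id:

    ceph health detail
    ceph pg dump_stuck unclean
    ceph pg 0.1 query    # shows why a specific PG is not going active+clean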

[ceph-users] Behaviour of Ceph while OSDs are down

2015-01-20 Thread Christian Eichelmann
Hi all, I want to understand what Ceph does if several OSDs are down. First of all, some words about our setup: We have 5 monitors and 12 OSD servers, each has 60x2TB disks. These servers are spread across 4 racks in our datacenter. Every rack holds 3 OSD servers. We have a replication factor of
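
Since the servers are spread across racks, the usual approach is a CRUSH rule that puts each replica in a different rack; a sketch with placeholder rule and pool names:

    # replicated rule with rack as the failure domain
    ceph osd crush rule create-simple replicated-racks default rack

    # look up the ruleset id it was given, then point the pool at it
    ceph osd crush rule dump replicated-racks
    ceph osd pool set <pool> crush_ruleset <ruleset-id>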

[ceph-users] Automatically timing out/removing dead hosts?

2015-01-20 Thread Christopher Armstrong
Hi folks, We have many users who run Deis on AWS, and our default configuration places hosts in an autoscaling group. Ceph runs on all hosts in the cluster (monitors and OSDs), and users have reported losing quorum after having several autoscaling events (new nodes getting added, old nodes termina
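
Two things tend to matter here: Ceph already marks a down OSD out on its own after mon_osd_down_out_interval, but map entries for hosts that are gone for good still have to be removed by hand (see the cleanup sketch under Greg's reply above). A sketch of tuning the timeout at runtime; the 600s value is only an example:

    # how long a down OSD waits before being marked out automatically
    ceph tell mon.* injectargs '--mon-osd-down-out-interval 600'

    # confirm the monitors picked it up (run on the mon host; mon.a is a placeholder)
    ceph daemon mon.a config show | grep mon_osd_down_out_interval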