[ceph-users] Ceph not replicating

2014-04-19 Thread Gonzalo Aguilar Delgado
Hi, I'm building a cluster where two nodes replicate objects between them. I found that shutting down just one of the nodes (the second one) makes everything "incomplete". I cannot work out why, since the crush map looks good to me. After shutting down one node, cluster 9028f4da-0d77-462b-be9b-dbdf7f
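A minimal set of commands for seeing what went incomplete after the node shutdown (a sketch, assuming the two-node, one-OSD-per-node layout described above):

    ceph -s                      # overall cluster health
    ceph health detail           # lists the incomplete/stuck PGs
    ceph osd tree                # shows both hosts and which OSD is down
    ceph pg dump_stuck inactive  # PGs that cannot serve IO
    ceph pg dump_stuck unclean   # PGs missing replicas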

Re: [ceph-users] Ceph not replicating

2014-04-19 Thread Jean-Charles Lopez
Hi Do you have chooseleaf type host or type node in your crush map? How many OSDs do you run on each host? Thx JC On Saturday, April 19, 2014, Gonzalo Aguilar Delgado < gagui...@aguilardelgado.com> wrote: > Hi, > > I'm building a cluster where two nodes replicate objects inside. I found > tha
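For reference, the chooseleaf step lives in the rules section of the decompiled crush map, and the standard bucket types here are host and osd. A minimal sketch of inspecting it (file names are illustrative):

    # Pull and decompile the current crush map
    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt

    # crushmap.txt will contain a rule similar to:
    rule data {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host   # one replica per host
        step emit
    }
    # "type host" spreads replicas across hosts; "type osd" allows
    # replicas on different OSDs of the same host.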

Re: [ceph-users] Ceph not replicating

2014-04-19 Thread Jean-Charles Lopez
Hi again I looked at your ceph -s. You have only 2 OSDs, one on each node. The default replica count is 2, and the default crush map places each replica on a different host, or maybe you set it to 2 different OSDs. Anyway, when one of your OSDs goes down, Ceph can no longer find another OSD to host the
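A quick way to confirm the replica count and the rule each pool uses (a sketch; 'data' is one of the default pool names of that era):

    ceph osd pool get data size           # replica count for the pool
    ceph osd pool get data crush_ruleset  # which crush rule it follows
    ceph osd crush rule dump              # the rules themselves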

Re: [ceph-users] Ceph not replicating

2014-04-19 Thread Michael J. Kidd
You may also want to check your 'min_size'... if it's 2, then you'll be incomplete even with 1 complete copy. ceph osd dump | grep pool You can reduce the min size with the following syntax: ceph osd pool set <pool> min_size 1 Thanks, Michael J. Kidd Sent from my mobile device. Please excuse brevit
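Spelled out with the default pool names from the quick start (a sketch, not a recommendation for production clusters):

    ceph osd dump | grep pool        # shows size/min_size per pool
    # e.g. pool 0 'data' replicated size 2 min_size 2 ...
    ceph osd pool set data min_size 1
    ceph osd pool set metadata min_size 1
    ceph osd pool set rbd min_size 1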

[ceph-users] Backfill and Recovery traffic shaping

2014-04-19 Thread Greg Poirier
We have a cluster in a sub-optimal configuration with data and journal colocated on OSDs (that coincidentally are spinning disks). During recovery/backfill, the entire cluster suffers degraded performance because of the IO storm that backfills cause. Client IO becomes extremely latent. I've tried
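The knobs usually mentioned for throttling recovery and backfill (a sketch; the values are illustrative, tune for your own hardware):

    # Apply at runtime to all OSDs
    ceph tell osd.* injectargs '--osd-max-backfills 1'
    ceph tell osd.* injectargs '--osd-recovery-max-active 1'
    ceph tell osd.* injectargs '--osd-recovery-op-priority 1'

    # Or persist in ceph.conf under [osd]
    [osd]
        osd max backfills = 1
        osd recovery max active = 1
        osd recovery op priority = 1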

Re: [ceph-users] Backfill and Recovery traffic shaping

2014-04-19 Thread Mike Dawson
Hi Greg, On 4/19/2014 2:20 PM, Greg Poirier wrote: We have a cluster in a sub-optimal configuration with data and journal colocated on OSDs (that coincidentally are spinning disks). During recovery/backfill, the entire cluster suffers degraded performance because of the IO storm that backfills

Re: [ceph-users] Backfill and Recovery traffic shaping

2014-04-19 Thread Greg Poirier
On Saturday, April 19, 2014, Mike Dawson wrote: > > > With a workload consisting of lots of small writes, I've seen client IO > starved with as little as 5Mbps of traffic per host due to spindle > contention once deep-scrub and/or recovery/backfill start. Co-locating OSD > Journals on the same spi
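If deep-scrub is contributing to the spindle contention, it can be paused cluster-wide while recovery runs (a sketch; flag availability depends on the Ceph release):

    ceph osd set noscrub
    ceph osd set nodeep-scrub
    # ... later, once recovery/backfill has finished
    ceph osd unset noscrub
    ceph osd unset nodeep-scrub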

Re: [ceph-users] Ceph not replicating

2014-04-19 Thread Gonzalo Aguilar Delgado
Hi all, first, thank you all for your answers. I will try to respond to everyone and to everything. First, ceph osd dump | grep pool pool 0 'data' replicated size 2 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 100 pgp_num 64 last_change 80 owner 0 flags hashpspool crash_replay_inter
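Reading that dump line field by field (values as shown above):

    pool 0 'data' replicated
        size 2           # two copies of each object
        min_size 2       # IO stops unless two copies are available
        crush_ruleset 0  # default rule: replicas on separate hosts
        pg_num 100 pgp_num 64
    # With one OSD per host, min_size 2 means losing either host makes
    # the PGs incomplete, which matches the behaviour reported.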

Re: [ceph-users] Ceph not replicating

2014-04-19 Thread Gonzalo Aguilar Delgado
Hi Michael, It worked. I didn't realize this because the docs install two OSD nodes and say the cluster should become active+clean after installing them. (Something that didn't work for me because of the 3 replicas problem). http://ceph.com/docs/master/start/quick-ceph-deploy/ Now I can shutdow

Re: [ceph-users] Ceph not replicating

2014-04-19 Thread Michael J. Kidd
> Can I safely remove the default pools? Yes, as long as you're not using the default pools to store data, you can delete them. > Why is the total size about 1 GB? It should be 500 MB since there are 2 replicas. I'm assuming that you're talking about the output of 'ceph df' or 'rados df'. These commands re
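For completeness, removing an unused default pool and checking raw vs. usable space look like this (a sketch; pool names are the defaults and the delete is irreversible):

    rados df                          # confirm the pool holds no objects
    ceph osd pool delete data data --yes-i-really-really-mean-it
    ceph df                           # the GLOBAL section reports raw space,
                                      # so 1 GB raw stores about 500 MB at size=2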