Hi everyone,
Up until recently, we were using GlusterFS to keep two web servers in
sync so we could take one down and switch back and forth between them,
e.g. for maintenance or failover. Usually, both were running, though.
The performance was abysmal, unfortunately. Copying many small files on
the file system caused outages of several minutes, which is simply
unacceptable. So I found Ceph. It's fairly new but I thought I'd give it
a try. I especially liked the good, detailed documentation, the
configurability and the many command-line tools which allow you to find
out what is going on with your cluster. All of this is severely lacking
with GlusterFS IMHO.
Because we're on a very tiny budget for this project, we cannot currently
have more than two file system servers. I added a small virtual server,
though, which only runs a monitor. So at least we have 3 monitor nodes. I
also created 3 MDSs, though as far as I understood, two of them are only
standbys. To sum it up, we have:
server0: Admin (Deployment started from here) + Monitor + MDS
server1: Monitor + MDS + OSD
server2: Monitor + MDS + OSD
So, the OSDs are on server1 and server2, which are next to each other
and connected by a local Gigabit Ethernet link. The cluster is mounted
(also on server1 and server2) as /var/www and Apache is serving files
off the cluster.
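
As a rough illustration of why the third monitor on server0 matters: as
far as I understand, Ceph monitors need a strict majority of them to form
a quorum. A minimal Python sketch of that arithmetic follows; the
function names are made up for illustration and are not part of any Ceph
tool.

```python
def quorum_size(monitors: int) -> int:
    """Smallest strict majority of `monitors` (assumed quorum rule)."""
    if monitors < 1:
        raise ValueError("need at least one monitor")
    return monitors // 2 + 1


def tolerated_monitor_failures(monitors: int) -> int:
    """How many monitors can be down while a majority is still up."""
    return monitors - quorum_size(monitors)


if __name__ == "__main__":
    # Compare a two-monitor layout with the three-monitor layout above.
    for n in (2, 3):
        print(f"{n} monitors: quorum needs {quorum_size(n)}, "
              f"can lose {tolerated_monitor_failures(n)}")
```

Under that assumption, two monitors alone could not lose either one,
while three monitors can lose one, which is why server0 is part of the
setup even though it has no OSD.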
I've used these configuration settings:
osd pool default size = 2
osd pool default min_size = 1
My idea was that by default everything should be replicated on 2 servers,
i.e. each file is normally written to both server1 and server2. In case
of emergency though (one server has a failure), it's better to keep
operating and only write the file to one server. Therefore, I set
min_size = 1. My further understanding is (correct me if I'm wrong)
that when the server comes back online, the files that were written to
only 1 server during the outage will automatically be replicated to the
server that has come back online.
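
To make my expectation concrete, here is a small Python sketch of how I
picture size and min_size interacting. It is a deliberately simplified
model of my own understanding, not Ceph's actual logic: it ignores
peering, the time it takes to detect a failed OSD, and MDS failover. The
function names are hypothetical.

```python
def pg_accepts_io(size: int, min_size: int, replicas_up: int) -> bool:
    """Assumed rule: I/O is served while at least min_size replicas are up."""
    if not 1 <= min_size <= size:
        raise ValueError("expected 1 <= min_size <= size")
    if not 0 <= replicas_up <= size:
        raise ValueError("expected 0 <= replicas_up <= size")
    return replicas_up >= min_size


def pg_is_degraded(size: int, replicas_up: int) -> bool:
    """Some, but not all, of the wanted replicas are available."""
    return 0 < replicas_up < size


if __name__ == "__main__":
    # size = 2, min_size = 1, as in my configuration settings.
    for up in (2, 1, 0):
        print(f"replicas up: {up}, "
              f"accepts I/O: {pg_accepts_io(2, 1, up)}, "
              f"degraded: {pg_is_degraded(2, up)}")
```

In this model, one surviving server with min_size = 1 still accepts I/O
in a degraded state, whereas min_size = 2 would block until the second
server returns.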
So far, so good. With two servers now online, the performance is
light-years away from sluggish GlusterFS. I've also worked with
XtreemFS, OCFS2 and AFS, and never had such good performance with any
cluster. In fact it's so blazingly fast that I had to check twice that I
really had the cluster mounted and wasn't accidentally working on the
local hard drive. Impressive. I can edit files on server1 and they are
immediately changed on server2 and vice versa. Great!
Unfortunately, when I now stop all Ceph services on server1, the
websites on server2 start to hang/freeze, and "ceph health" shows "#x
blocked requests". Now, what I don't understand: Why is it blocking?
Shouldn't both servers have the file? And didn't I set min_size to "1"?
And if there are a few files (could be some unimportant stuff) that are
missing on one of the servers: How can I abort the blocking? I'd rather
have a missing file or whatever than a completely blocked website.
Are my files really duplicated 1:1 - or are they perhaps spread evenly
between both OSDs? Do I have to edit the crushmap to achieve a real
"RAID-1"-type of replication? Is there a command to find out for a
specific file where it actually resides and whether it has really been
replicated?
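
One approach that might answer this, from what I could gather: CephFS
seems to store a file's data in RADOS objects named after the file's
inode number in hex, followed by a dot and an 8-digit hex object index,
and "ceph osd map <pool> <object>" reports the placement group and OSDs
for an object. A Python sketch under those assumptions; the helper names
and the pool name "data" are illustrative and may not match this
cluster.

```python
import os
import shlex


def cephfs_object_name(inode: int, object_index: int = 0) -> str:
    """Assumed naming scheme: <inode in hex>.<object index, 8 hex digits>."""
    if inode < 0 or object_index < 0:
        raise ValueError("inode and object_index must be non-negative")
    return f"{inode:x}.{object_index:08x}"


def osd_map_command(pool: str, object_name: str) -> list[str]:
    """Build (but do not run) the 'ceph osd map' call for one object."""
    return ["ceph", "osd", "map", pool, object_name]


def locate_command_for(path: str, pool: str = "data") -> list[str]:
    """Command for the first object of a file on the mounted cluster."""
    inode = os.stat(path).st_ino
    return osd_map_command(pool, cephfs_object_name(inode))


if __name__ == "__main__":
    # Example with a made-up inode number; pass a real path under
    # /var/www to locate_command_for() on a cluster node instead.
    name = cephfs_object_name(0x10000000000)
    print(shlex.join(osd_map_command("data", name)))
```

If the output of the resulting command lists both OSDs for the object,
I would take that as a sign the file is stored on both servers.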
Thank you!
ceph-users mailing list