Re: [ceph-users] Optimal OSD Configuration for 45 drives?

2014-07-25 Thread Christian Balzer
On Fri, 25 Jul 2014 13:31:34 +1000 Matt Harlum wrote:

> Hi,
> 
> I’ve purchased a couple of 45Drives enclosures and would like to figure
> out the best way to configure these for ceph?
> 
That's the second time within a month somebody mentions these 45 drive
chassis. 
Would you mind elaborating which enclosures these are precisely?

I'm wondering especially about the backplane, as 45 is such an odd number.

Also if you don't mind, specify "a couple" and what your net storage
requirements are.

In fact, read this before continuing:
---
https://www.mail-archive.com/ceph-users@lists.ceph.com/msg11011.html
---

> Mainly I was wondering if it was better to set up multiple raid groups
> and then put an OSD on each rather than an OSD for each of the 45 drives
> in the chassis? 
> 
Steve already toed the conservative Ceph party line here, let me give you
some alternative views and options on top of that, and recap what I
wrote in the thread above.

In addition to his links, read this:
---
https://objects.dreamhost.com/inktankweb/Inktank_Hardware_Configuration_Guide.pdf
---

Let's go from cheap and cheerful to "comes with racing stripes".

1) All spinning rust, all the time. Plunk in 45 drives as JBOD behind the
cheapest (and densest) controllers you can get. Having the journal on the
disks will halve their performance, but you just wanted the space and are
not that pressed for IOPS.
The best you can expect per node with this setup is something around 2300
IOPS with normal (7200RPM) disks (roughly 45 drives at ~100 IOPS each,
halved by the on-disk journal).

2) Same as 1), but use controllers with a large HW cache (4GB Areca comes
to mind) in JBOD (or 45 times RAID0) mode.
This will alleviate some of the thrashing problems, particularly if you're
expecting the high IOPS to come in short bursts.

3) Ceph Classic, basically what Steve wrote.
32 HDDs, 8 SSDs for journals (you do NOT want an uneven spread of journals).
This will give you a sustainable 3200 IOPS, and because the journals on
SSDs not only avoid all that thrashing about on the disk but also allow for
coalescing of writes, this is going to be the fastest solution so far.
Of course you will need 3 of these at minimum for acceptable redundancy,
unlike 4) which just needs a replication level of 2.
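
For reference, wiring an OSD to an SSD journal is just a matter of pointing
it at a partition; a minimal sketch (device paths and IDs are placeholders,
adjust to your layout):

    [osd]
        osd journal size = 10240      # MB, per journal partition
    [osd.0]
        osd journal = /dev/sdaa1      # partition on one of the journal SSDs

or let ceph-deploy/ceph-disk carve it out for you with something like
"ceph-deploy osd create host:sdb:/dev/sdaa".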

4) The anti-cephalopod. See my reply from a month ago in the link above.
All the arguments apply; it very much depends upon your use case and
budget. In my case the higher density, lower cost and ease of maintaining
the cluster were well worth the lower IOPS.

5) We can improve upon 3) by using HW cached controllers, of course. And
hey, you did need to connect those drive bays somehow anyway. ^o^
Maybe even squeeze some more out of it by having the SSD controller
separate from the HDD one(s).
This is as fast (IOPS-wise) as it gets w/o going to full SSD.


Networking:
Any of the setups above will saturate a single 10Gb/s (aka 1GB/s) link, as
Steve noted.
In fact 3) to 5) will be able to write up to 4GB/s in theory based on the
HDDs' sequential performance, but that is unlikely to be seen in real life.
And of course your maximum sustained write speed is bounded by the speed of
the SSDs. So for example with 3) you would want those 8 SSDs to have write
speeds of about 250MB/s each, giving you 2GB/s max write.
Which in turn means two 10Gb/s links at least, up to 4 if you want
redundancy and/or a separation of public and cluster network.
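
For illustration, the back-of-the-envelope for 3) looks like this (assuming
~250MB/s sequential write per journal SSD; plug in your actual SSD specs):

    ssds=8; ssd_mbps=250; link_mbps=1250   # one 10Gb/s link is ~1.25GB/s
    total=$((ssds * ssd_mbps))             # 2000 MB/s aggregate journal write
    echo "max write ${total} MB/s, 10GbE links: $(( (total + link_mbps - 1) / link_mbps ))"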

RAM:
The more, the merrier.
It's relatively cheap, and avoiding having to actually read from the disks
will make your write IOPS so much happier.

CPU:
You'll want something like Steve recommended for 3); I'd go with two 8-core
CPUs actually, so you have some oomph to spare for the OS, IRQ handling,
etc. With 4) and its actual 4 OSDs, about half of that will be fine, with
the expectation of Ceph code improvements.

Mobo:
You're fine for overall PCIe bandwidth, even w/o going to PCIe v3.
But you might have up to 3 HBAs/RAID cards and 2 network cards, so make
sure you can get all of this into appropriate slots.

Regards,

Christian
-- 
Christian BalzerNetwork/Systems Engineer
ch...@gol.com   Global OnLine Japan/Fusion Communications
http://www.gol.com/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] OSD weight 0

2014-07-25 Thread Kapil Sharma
Hi,

I am using ceph-deploy to deploy my cluster. Whenever I try to add more
than one OSD to a node, every OSD except the first gets a weight of 0 and
ends up in a state of down and out.

So, if I have three nodes in my cluster, I can successfully add one OSD to
each of the three nodes, but the moment I try to add a second OSD to any
of the nodes, it gets a weight of 0 and goes down and out.

The capacity of all the disks is the same.


cephdeploy@node-1:~/cluster> ceph osd tree
# id    weight  type name       up/down reweight
-1      1.82    root default
-2      1.82            host node-1
0       1.82                    osd.0   up      1
1       0                       osd.1   down    0

There is no error as such after I run the ceph-deploy activate command.

Has anyone seen this issue before?
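
In the meantime, would it be a sane workaround to weight and bring the OSD
in by hand, something like the following (assuming the osd.1 disk is also
~1.82, like osd.0)?

    ceph osd crush reweight osd.1 1.82
    sudo service ceph start osd.1      # or: /etc/init.d/ceph start osd.1
    ceph osd in 1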



Kind Regards,
Kapil.






___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] radosgw monitoring

2014-07-25 Thread pragya jain
Hi all,

Please suggest some open source monitoring tools which can monitor radosgw
instances for incoming user request traffic (uploads and downloads of the
stored data) and also for monitoring other features of radosgw.

Regards
Pragya Jain
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] monitoring features of statsd server

2014-07-25 Thread pragya jain
hi all,

could somebody please help me understand which metrics a StatsD server
gathers for monitoring a Ceph storage cluster with a radosgw client for
object storage?

Regards 
Pragya Jain
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] RBD import format 1 & 2

2014-07-25 Thread NEVEU Stephane

Hi all,

One quick question about image format 1 & 2 :

I've got a img.qcow2 and I want to convert it :

The first solution is qemu-img convert -f qcow2 -O rbd img.qcow2 
rbd:/mypool/myimage

As far as I understand, it will be converted into format 1, which is the
default, so I won't be able to clone my image.

The second solution is to import it directly as format 2:
rbd import --image-format 2 img.qcow2 mypool/myimage

But in this case, when I start my VM, the VM's filesystem turns read-only
with many buffer I/O errors on dm-0.

I'm running Ubuntu 14.04 for both the KVM host and the VMs, so the kernel
version is 3.13.0-30.

Any idea ?
Thx
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Could not find module rbd. CentOs 6.4

2014-07-25 Thread Pratik Rupala

Hi,

I am deploying the Firefly release on CentOS 6.4, following the quick
installation instructions available at ceph.com.

I have a customized kernel in CentOS 6.4, version 2.6.32.

I am able to create a basic Ceph storage cluster in the active+clean state.
Now I am trying to create a block device image on the Ceph client, but it
gives the messages shown below:


[ceph@ceph-client1 ~]$ rbd create foo --size 1024
2014-07-25 22:31:48.519218 7f6721d43700  0 -- 172.17.35.20:0/1003053 >> 
172.17.35.22:6800/1875 pipe(0x6a7c50 sd=4 :0 s=1 pgs=0 cs=0 l=1 
c=0x6a8050).fault
2014-07-25 22:32:18.536771 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
172.17.35.22:6800/1875 pipe(0x7f6718006310 sd=5 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f6718006580).fault
2014-07-25 22:33:09.598763 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
172.17.35.22:6800/1875 pipe(0x7f67180063e0 sd=5 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f6718007e70).fault
2014-07-25 22:34:08.621655 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
172.17.35.22:6800/1875 pipe(0x7f6718007e70 sd=5 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f67180080e0).fault
2014-07-25 22:35:19.581978 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
172.17.35.22:6800/1875 pipe(0x7f6718007e70 sd=5 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f67180080e0).fault
2014-07-25 22:36:23.694665 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
172.17.35.22:6800/1875 pipe(0x7f6718007e70 sd=5 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f67180080e0).fault
2014-07-25 22:37:28.868293 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
172.17.35.22:6800/1875 pipe(0x7f6718007e70 sd=5 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f67180080e0).fault
2014-07-25 22:38:29.159830 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
172.17.35.22:6800/1875 pipe(0x7f6718007e70 sd=5 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f67180080e0).fault
2014-07-25 22:39:28.854441 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
172.17.35.22:6800/1875 pipe(0x7f6718001db0 sd=5 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f6718006990).fault
2014-07-25 22:40:14.581055 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
172.17.35.22:6800/1875 pipe(0x7f6718001ac0 sd=5 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f671800c950).fault
2014-07-25 22:41:03.794903 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
172.17.35.22:6800/1875 pipe(0x7f6718004d30 sd=5 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f671800c950).fault
2014-07-25 22:42:12.537442 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
172.17.35.22:6800/1875 pipe(0x6a4640 sd=5 :0 s=1 pgs=0 cs=0 l=1 
c=0x6a4a00).fault
2014-07-25 22:43:18.912430 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
172.17.35.22:6800/1875 pipe(0x7f6718008300 sd=5 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f67180080e0).fault
2014-07-25 22:44:24.129258 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
172.17.35.22:6800/1875 pipe(0x7f6718008300 sd=5 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f6718008f80).fault
2014-07-25 22:45:29.174719 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
172.17.35.22:6800/1875 pipe(0x7f671800a150 sd=5 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f671800a620).fault
2014-07-25 22:46:34.032246 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
172.17.35.22:6800/1875 pipe(0x7f6718008390 sd=5 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f671800a620).fault
2014-07-25 22:47:39.551973 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
172.17.35.22:6800/1875 pipe(0x7f6718008390 sd=5 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f67180077e0).fault
2014-07-25 22:48:39.342226 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
172.17.35.22:6800/1875 pipe(0x7f6718001db0 sd=5 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f6718003040).fault


I am not sure whether the block device image has been created or not.
Further, I tried the command below, which fails:

[ceph@ceph-client1 ~]$ sudo rbd map foo
ERROR: modinfo: could not find module rbd
FATAL: Module rbd not found.
rbd: modprobe rbd failed! (256)

If I check the health of cluster it looks fine.
[ceph@node1 ~]$ ceph -s
cluster 98f22f5d-783b-43c2-8ae7-b97a715c9c86
 health HEALTH_OK
 monmap e1: 1 mons at {node1=172.17.35.17:6789/0}, election epoch 
1, quorum 0 node1

 osdmap e5972: 3 osds: 3 up, 3 in
  pgmap v20011: 192 pgs, 3 pools, 142 bytes data, 2 objects
190 MB used, 45856 MB / 46046 MB avail
 192 active+clean

Please let me know if I am doing anything wrong.

Regards,
Pratik Rupala
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Could not find module rbd. CentOs 6.4

2014-07-25 Thread Karan Singh
Hi Pratik

Ceph RBD support was added to the mainline Linux kernel starting with
2.6.34. The errors below show that the RBD module is not present in your
kernel.

It's advisable to run the latest stable kernel release if you need RBD to
work.

> ERROR: modinfo: could not find module rbd
> FATAL: Module rbd not found.
> rbd: modprobe rbd failed! (256)
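
You can confirm what your kernel provides with something like:

    uname -r          # running kernel version
    modinfo rbd       # fails if the kernel was built without the rbd module
    sudo modprobe rbd && lsmod | grep rbd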



- Karan -

On 25 Jul 2014, at 14:52, Pratik Rupala  wrote:

> Hi,
> 
> I am deploying firefly version on CentOs 6.4. I am following quick 
> installation instructions available at ceph.com.
> I have my customized kernel version in CentOs 6.4 which is 2.6.32.
> 
> I am able to create basic Ceph storage cluster with active+clean state. Now I 
> am trying to create block device image on ceph client but it is giving 
> messages as shown below:
> 
> [ceph@ceph-client1 ~]$ rbd create foo --size 1024
> 2014-07-25 22:31:48.519218 7f6721d43700  0 -- 172.17.35.20:0/1003053 >> 
> 172.17.35.22:6800/1875 pipe(0x6a7c50 sd=4 :0 s=1 pgs=0 cs=0 l=1 
> c=0x6a8050).fault
> 2014-07-25 22:32:18.536771 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
> 172.17.35.22:6800/1875 pipe(0x7f6718006310 sd=5 :0 s=1 pgs=0 cs=0 l=1 
> c=0x7f6718006580).fault
> 2014-07-25 22:33:09.598763 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
> 172.17.35.22:6800/1875 pipe(0x7f67180063e0 sd=5 :0 s=1 pgs=0 cs=0 l=1 
> c=0x7f6718007e70).fault
> 2014-07-25 22:34:08.621655 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
> 172.17.35.22:6800/1875 pipe(0x7f6718007e70 sd=5 :0 s=1 pgs=0 cs=0 l=1 
> c=0x7f67180080e0).fault
> 2014-07-25 22:35:19.581978 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
> 172.17.35.22:6800/1875 pipe(0x7f6718007e70 sd=5 :0 s=1 pgs=0 cs=0 l=1 
> c=0x7f67180080e0).fault
> 2014-07-25 22:36:23.694665 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
> 172.17.35.22:6800/1875 pipe(0x7f6718007e70 sd=5 :0 s=1 pgs=0 cs=0 l=1 
> c=0x7f67180080e0).fault
> 2014-07-25 22:37:28.868293 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
> 172.17.35.22:6800/1875 pipe(0x7f6718007e70 sd=5 :0 s=1 pgs=0 cs=0 l=1 
> c=0x7f67180080e0).fault
> 2014-07-25 22:38:29.159830 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
> 172.17.35.22:6800/1875 pipe(0x7f6718007e70 sd=5 :0 s=1 pgs=0 cs=0 l=1 
> c=0x7f67180080e0).fault
> 2014-07-25 22:39:28.854441 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
> 172.17.35.22:6800/1875 pipe(0x7f6718001db0 sd=5 :0 s=1 pgs=0 cs=0 l=1 
> c=0x7f6718006990).fault
> 2014-07-25 22:40:14.581055 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
> 172.17.35.22:6800/1875 pipe(0x7f6718001ac0 sd=5 :0 s=1 pgs=0 cs=0 l=1 
> c=0x7f671800c950).fault
> 2014-07-25 22:41:03.794903 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
> 172.17.35.22:6800/1875 pipe(0x7f6718004d30 sd=5 :0 s=1 pgs=0 cs=0 l=1 
> c=0x7f671800c950).fault
> 2014-07-25 22:42:12.537442 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
> 172.17.35.22:6800/1875 pipe(0x6a4640 sd=5 :0 s=1 pgs=0 cs=0 l=1 
> c=0x6a4a00).fault
> 2014-07-25 22:43:18.912430 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
> 172.17.35.22:6800/1875 pipe(0x7f6718008300 sd=5 :0 s=1 pgs=0 cs=0 l=1 
> c=0x7f67180080e0).fault
> 2014-07-25 22:44:24.129258 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
> 172.17.35.22:6800/1875 pipe(0x7f6718008300 sd=5 :0 s=1 pgs=0 cs=0 l=1 
> c=0x7f6718008f80).fault
> 2014-07-25 22:45:29.174719 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
> 172.17.35.22:6800/1875 pipe(0x7f671800a150 sd=5 :0 s=1 pgs=0 cs=0 l=1 
> c=0x7f671800a620).fault
> 2014-07-25 22:46:34.032246 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
> 172.17.35.22:6800/1875 pipe(0x7f6718008390 sd=5 :0 s=1 pgs=0 cs=0 l=1 
> c=0x7f671800a620).fault
> 2014-07-25 22:47:39.551973 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
> 172.17.35.22:6800/1875 pipe(0x7f6718008390 sd=5 :0 s=1 pgs=0 cs=0 l=1 
> c=0x7f67180077e0).fault
> 2014-07-25 22:48:39.342226 7f6721b41700  0 -- 172.17.35.20:0/1003053 >> 
> 172.17.35.22:6800/1875 pipe(0x7f6718001db0 sd=5 :0 s=1 pgs=0 cs=0 l=1 
> c=0x7f6718003040).fault
> 
> I am not sure whether block device image has been created or not. Further I 
> tried below command which fails:
> [ceph@ceph-client1 ~]$ sudo rbd map foo
> ERROR: modinfo: could not find module rbd
> FATAL: Module rbd not found.
> rbd: modprobe rbd failed! (256)
> 
> If I check the health of cluster it looks fine.
> [ceph@node1 ~]$ ceph -s
>cluster 98f22f5d-783b-43c2-8ae7-b97a715c9c86
> health HEALTH_OK
> monmap e1: 1 mons at {node1=172.17.35.17:6789/0}, election epoch 1, 
> quorum 0 node1
> osdmap e5972: 3 osds: 3 up, 3 in
>  pgmap v20011: 192 pgs, 3 pools, 142 bytes data, 2 objects
>190 MB used, 45856 MB / 46046 MB avail
> 192 active+clean
> 
> Please let me know if I am doing anything wrong.
> 
> Regards,
> Pratik Rupala
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Optimal OSD Configuration for 45 drives?

2014-07-25 Thread Mark Nelson

On 07/25/2014 02:54 AM, Christian Balzer wrote:

On Fri, 25 Jul 2014 13:31:34 +1000 Matt Harlum wrote:


Hi,

I’ve purchased a couple of 45Drives enclosures and would like to figure
out the best way to configure these for ceph?


That's the second time within a month somebody mentions these 45 drive
chassis.
Would you mind elaborating which enclosures these are precisely?


I'm guessing the supermicro SC847E26:

http://www.supermicro.com/products/chassis/4U/847/SC847E26-RJBOD1.cfm



I'm wondering especially about the backplane, as 45 is such an odd number.

Also if you don't mind, specify "a couple" and what your net storage
requirements are.

In fact, read this before continuing:
---
https://www.mail-archive.com/ceph-users@lists.ceph.com/msg11011.html
---


Mainly I was wondering if it was better to set up multiple raid groups
and then put an OSD on each rather than an OSD for each of the 45 drives
in the chassis?


Steve already towed the conservative Ceph party line here, let me give you
some alternative views and options on top of that and to recap what I
wrote in the thread above.

In addition to his links, read this:
---
https://objects.dreamhost.com/inktankweb/Inktank_Hardware_Configuration_Guide.pdf
---

Lets go from cheap and cheerful to "comes with racing stripes".

1) All spinning rust, all the time. Plunk in 45 drives, as JBOD behind the
cheapest (and densest) controllers you can get. Having the journal on the
disks will halve their performance, but you just wanted the space and are
not that pressed for IOPS.
The best you can expect per node with this setup is something around 2300
IOPS with normal (7200RPM) disks.

2) Same as 1), but use controllers with a large HW cache (4GB Areca comes
to mind) in JBOD (or 45 times RAID0) mode.
This will alleviate some of the thrashing problems, particular if you're
expecting high IOPS to be in short bursts.

3) Ceph Classic, basically what Steve wrote.
32HDDs, 8SSDs for journals (you do NOT want an uneven spread of journals).
This will give you sustainable 3200 IOPS, but of course the journals on
SSDs not only avoid all that trashing about on the disk but also allow for
coalescing of writes, so this is going to be fastest solution so far.
Of course you will need 3 of these at minimum for acceptable redundancy,
unlike 4) which just needs a replication level of 2.

4) The anti-cephalopod. See my reply from a month ago in the link above.
All the arguments apply, it very much depends upon your use case and
budget. In my case the higher density, lower cost and ease of maintaining
the cluster where well worth the lower IOPS.

5) We can improve upon 3) by using HW cached controllers of course. And
hey, you did need to connect those drive bays somehow anyway. ^o^
Maybe even squeeze some more out of it by having the SSD controller
separate from the HDD one(s).
This is as fast (IOPS) as it comes w/o going to full SSD.


Networking:
Either of the setups above will saturate a single 10Gb/s aka 1GB/s as
Steve noted.
In fact 3) to 5) will be able to write up to 4GB/s in theory based on the
HDDs sequential performance, but that is unlikely to be seen in real live.
And of course your maximum write speed is  based on the speed of the SSDs.
So for example with 3) you would want those 8 SSDs to have write speeds of
about 250MB/s, giving you 2GB/s max write.
Which in turn means 2 10GB/s links at least, up to 4 if you want
redundancy and/or a separation of public and cluster network.

RAM:
The more, the merrier.
It's relatively cheap and avoiding have to actually read from the disks
will make your write IOPS so much happier.

CPU:
You'll want something like Steve recommended for 3), I'd go with 2 8core
CPUs actually, so you have some Oomps to spare for the OS, IRQ handling,
etc. With 4) and actual 4 OSDs, about half of that will be fine, with the
expectation of Ceph code improvements.

Mobo:
You're fine for overall PCIe bandwidth, even w/o going to PCIe v3.
But you might have up to 3 HBAs/RAID cards and 2 network cards, so make
sure you and get this all into appropriate slots.

Regards,

Christian



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD import format 1 & 2

2014-07-25 Thread NEVEU Stephane
I finally reconverted my only "format 1" image into format 2, so now
everything is in format 2, but I'm still confused: my VM disks are still
read-only (I've tried different images, CentOS 6.5 with kernel 2.6.32 and
Ubuntu with 3.13). Do I have to modprobe rbd on the host?
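
Is something like this the right way to verify the imported image (I assume
it should report format 2 and the same size as the raw data)?

    rbd info mypool/myimage
    qemu-img info rbd:mypool/myimage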



De : ceph-users [mailto:ceph-users-boun...@lists.ceph.com] De la part de NEVEU 
Stephane
Envoyé : vendredi 25 juillet 2014 13:45
À : ceph-users@lists.ceph.com
Objet : [ceph-users] RBD import format 1 & 2


Hi all,

One quick question about image format 1 & 2 :

I've got a img.qcow2 and I want to convert it :

The first solution is qemu-img convert -f qcow2 -O rbd img.qcow2 
rbd:/mypool/myimage

As far as I understood It will converted into format 1 which is the default one 
so I won't be able to clone my image.

Second solution is to import it directly into format 2 :
Rbd import -image-format 2 img.qcow2 mypool/myimage

But in this case, when I start my VM, my vm / filesystem turns readonly with 
many buffer IO error on dm-0.

I'm running Ubuntu 14.04 for both kvm host and VMs so kernel version is 
3.13.0-30

Any idea ?
Thx
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Optimal OSD Configuration for 45 drives?

2014-07-25 Thread Robert Fantini
Hello Christian.

 Our current setup has 4 OSDs per node. When a drive fails, the cluster is
almost unusable for data entry. I want to change our setup so that this
never happens under any circumstances. We used DRBD for 8 years, and our
main concern is high availability; a CLI that feels like a 1200 bps modem
does not count as available.
 Network: we use 2 IB switches and bonding in failover mode.
 Systems are two Dell PowerEdge R720 and Supermicro X8DT3.

 So, looking at how to do things better, we will try #4, the anti-cephalopod.

We'll switch to using RAID-10 or RAID-6 and have one OSD per node, using
high-end RAID controllers, hot spares, etc.

And use one Intel 200GB S3700 per node for the journal.

My questions:

Is there a minimum number of OSDs which should be used?

Should the number of OSDs per node be the same?

best regards, Rob
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD import format 1 & 2

2014-07-25 Thread Campbell, Bill
When you run qemu-img you are essentially converting the qcow2 image to
the appropriate raw format as part of the import process into the
cluster.  When you use rbd import you are not doing a conversion, so the
image is imported AS IS (you can validate this by looking at the size of
the image after importing).  In order to get to format 2 you may need to
convert the qcow2 to raw first, then import.  Unfortunately I don't think
qemu-img supports outputting to stdout, so this will have to be a two-step
process.
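
Roughly something like this (untested here, adjust the pool/image names):

    qemu-img convert -f qcow2 -O raw img.qcow2 img.raw
    rbd import --image-format 2 img.raw mypool/myimage
    rbd info mypool/myimage    # sanity check: format 2, size of the raw image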



From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
NEVEU Stephane
Sent: Friday, July 25, 2014 8:57 AM
To: NEVEU Stephane; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] RBD import format 1 & 2



I finally reconverted my only “format 1” image into format 2 so now
everything is in format 2, but I’m still confused, my vm disks are still
readonly (I’ve tried different images centos 6.5 with kernel 2.6.32 and
ubuntu with 3.13), do I have to modprobe rbd on the host ?







De : ceph-users [mailto:ceph-users-boun...@lists.ceph.com] De la part de
NEVEU Stephane
Envoyé : vendredi 25 juillet 2014 13:45
À : ceph-users@lists.ceph.com
Objet : [ceph-users] RBD import format 1 & 2





Hi all,



One quick question about image format 1 & 2 :



I’ve got a img.qcow2 and I want to convert it :



The first solution is qemu-img convert –f qcow2 –O rbd img.qcow2
rbd:/mypool/myimage



As far as I understood It will converted into format 1 which is the
default one so I won’t be able to clone my image.



Second solution is to import it directly into format 2 :

Rbd import –image-format 2 img.qcow2 mypool/myimage



But in this case, when I start my VM, my vm / filesystem turns readonly
with many buffer IO error on dm-0.



I’m running Ubuntu 14.04 for both kvm host and VMs so kernel version is
3.13.0-30



Any idea ?

Thx



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Issues compiling Ceph (master branch) on Debian Wheezy (armhf)

2014-07-25 Thread Deven Phillips
root@cubie01:~# aptitude search perftools
p   google-perftools   - command
line utilities to analyze the performance of C++ programs
root@cubie01:~# aptitude install google-perftools
The following NEW packages will be installed:
  google-perftools{b}
The following packages are RECOMMENDED but will NOT be installed:
  graphviz gv
0 packages upgraded, 1 newly installed, 0 to remove and 40 not upgraded.
Need to get 78.3 kB of archives. After unpacking 238 kB will be used.
The following packages have unmet dependencies:
 google-perftools : Depends: libgoogle-perftools4 which is a virtual
package.
Depends: curl but it is not going to be installed.
The following actions will resolve these dependencies:

 Keep the following packages at their current version:
1) google-perftools [Not Installed]



Accept this solution? [Y/n/q/?] n

*** No more solutions available ***



On Fri, Jul 25, 2014 at 10:51 AM, zhu qiang 
wrote:

> Hi,
>
> may be you miss : libgoogle-perftools-dev.
>
> try apt-get install  -y libgoogle-perftools-dev
>
>
>
> best.
>
>
>
> *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf
> Of *Deven Phillips
> *Sent:* Friday, July 25, 2014 11:55 AM
> *To:* ceph-users@lists.ceph.com
> *Subject:* [ceph-users] Issues compiling Ceph (master branch) on Debian
> Wheezy (armhf)
>
>
>
> Hi all,
>
>
>
> I am in the process of installing and setting up Ceph on a group of
> Allwinner A20 SoC mini computers. They are armhf devices and I have
> installed Cubian (http://cubian.org/), which is a port of Debian Wheezy.
> I tried to follow the instructions at:
>
>
>
> http://ceph.com/docs/master/install/build-ceph/
>
>
>
> But I found that some needed dependencies were not installed. Below is a
> list of the items I had to install in order to compile Ceph for these
> devices:
>
>
>
> uuid-dev
>
> libblkid-dev
>
> libudev-dev
>
> libatomic-ops-dev
>
> libsnappy-dev
>
> libleveldb-dev
>
> xfslibs-dev
>
> libboost-all-dev
>
>
>
> I also had to specify --without-tcmalloc because I could not find a
> package which implements that for the armhf platform.
>
>
>
> I hope this helps others!!
>
>
>
> Deven Phillips
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Issues compiling Ceph (master branch) on Debian Wheezy (armhf)

2014-07-25 Thread Owen Synge
Dear Deven,

Another solution is to compile leveldb and ceph without tcmalloc support :)

Ceph and leveldb work just fine without gperftools, and I have yet to do
benchmarks on how much performance benefit you get from google-perftools'
tcmalloc replacing glibc's malloc.
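
i.e. roughly the following, reusing the dependency list Deven posted (a
sketch, not an exhaustive package list for every distro):

    sudo apt-get install build-essential automake autoconf libtool pkg-config \
        uuid-dev libblkid-dev libudev-dev libatomic-ops-dev libsnappy-dev \
        libleveldb-dev xfslibs-dev libboost-all-dev
    ./autogen.sh
    ./configure --without-tcmalloc
    make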

Best regards

Owen



On 07/25/2014 04:52 PM, Deven Phillips wrote:
> root@cubie01:~# aptitude search perftools
> p   google-perftools   - command
> line utilities to analyze the performance of C++ programs
> root@cubie01:~# aptitude install google-perftools
> The following NEW packages will be installed:
>   google-perftools{b}
> The following packages are RECOMMENDED but will NOT be installed:
>   graphviz gv
> 0 packages upgraded, 1 newly installed, 0 to remove and 40 not upgraded.
> Need to get 78.3 kB of archives. After unpacking 238 kB will be used.
> The following packages have unmet dependencies:
>  google-perftools : Depends: libgoogle-perftools4 which is a virtual
> package.
> Depends: curl but it is not going to be installed.
> The following actions will resolve these dependencies:
> 
>  Keep the following packages at their current version:
> 1) google-perftools [Not Installed]
> 
> 
> 
> Accept this solution? [Y/n/q/?] n
> 
> *** No more solutions available ***
> 
> 
> 
> On Fri, Jul 25, 2014 at 10:51 AM, zhu qiang 
> wrote:
> 
>> Hi,
>>
>> may be you miss : libgoogle-perftools-dev.
>>
>> try apt-get install  -y libgoogle-perftools-dev
>>
>>
>>
>> best.
>>
>>
>>
>> *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf
>> Of *Deven Phillips
>> *Sent:* Friday, July 25, 2014 11:55 AM
>> *To:* ceph-users@lists.ceph.com
>> *Subject:* [ceph-users] Issues compiling Ceph (master branch) on Debian
>> Wheezy (armhf)
>>
>>
>>
>> Hi all,
>>
>>
>>
>> I am in the process of installing and setting up Ceph on a group of
>> Allwinner A20 SoC mini computers. They are armhf devices and I have
>> installed Cubian (http://cubian.org/), which is a port of Debian Wheezy.
>> I tried to follow the instructions at:
>>
>>
>>
>> http://ceph.com/docs/master/install/build-ceph/
>>
>>
>>
>> But I found that some needed dependencies were not installed. Below is a
>> list of the items I had to install in order to compile Ceph for these
>> devices:
>>
>>
>>
>> uuid-dev
>>
>> libblkid-dev
>>
>> libudev-dev
>>
>> libatomic-ops-dev
>>
>> libsnappy-dev
>>
>> libleveldb-dev
>>
>> xfslibs-dev
>>
>> libboost-all-dev
>>
>>
>>
>> I also had to specify --without-tcmalloc because I could not find a
>> package which implements that for the armhf platform.
>>
>>
>>
>> I hope this helps others!!
>>
>>
>>
>> Deven Phillips
>>
> 
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Forcing ceph-mon to bind to a certain IP address

2014-07-25 Thread fake rao
I would like ceph-mon to bind to 0.0.0.0 since it is running on a machine
that gets its IP from a DHCP server and the IP changes on every boot.

Is there a way to specify this in the ceph.conf file?
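
For example, would something like this work, or does the monitor need a
fixed address baked into the monmap?

    [mon.a]
        host = mymonhost              # placeholder hostname
        mon addr = 0.0.0.0:6789       # untested, just what I had in mind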

Thanks
Akshay
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Optimal OSD Configuration for 45 drives?

2014-07-25 Thread Christian Balzer
On Fri, 25 Jul 2014 07:24:26 -0500 Mark Nelson wrote:

> On 07/25/2014 02:54 AM, Christian Balzer wrote:
> > On Fri, 25 Jul 2014 13:31:34 +1000 Matt Harlum wrote:
> >
> >> Hi,
> >>
> >> I’ve purchased a couple of 45Drives enclosures and would like to
> >> figure out the best way to configure these for ceph?
> >>
> > That's the second time within a month somebody mentions these 45 drive
> > chassis.
> > Would you mind elaborating which enclosures these are precisely?
> 
> I'm guessing the supermicro SC847E26:
> 
> http://www.supermicro.com/products/chassis/4U/847/SC847E26-RJBOD1.cfm
> 
Le Ouch!

They really must be getting desperate for high-density chassis that are
not top-loading at Supermicro.

Well, if I read that link and the actual manual correctly, the most one
can hope to get from this is 48Gb/s (2 mini-SAS with 4 lanes each), which
is short of what 45 regular HDDs can dish out (or take in).
And that's ignoring the inherent deficiencies when dealing with port
expanders.

Either way, a head for this kind of enclosure would need pretty much all
the things mentioned before: a low-density (8 lanes) but high-performance,
large-cache controller, and definitely SSDs for journals.

There must be some actual threshold, but my gut feeling tells me that
something slightly less dense, where you don't have to get another case
for the head, might turn out cheaper.
Especially if a 1U head (RAID/HBA and network cards) with space for
journal SSDs doesn't cut it.

Christian

> >
> > I'm wondering especially about the backplane, as 45 is such an odd
> > number.
> >
> > Also if you don't mind, specify "a couple" and what your net storage
> > requirements are.
> >
> > In fact, read this before continuing:
> > ---
> > https://www.mail-archive.com/ceph-users@lists.ceph.com/msg11011.html
> > ---
> >
> >> Mainly I was wondering if it was better to set up multiple raid groups
> >> and then put an OSD on each rather than an OSD for each of the 45
> >> drives in the chassis?
> >>
> > Steve already towed the conservative Ceph party line here, let me give
> > you some alternative views and options on top of that and to recap
> > what I wrote in the thread above.
> >
> > In addition to his links, read this:
> > ---
> > https://objects.dreamhost.com/inktankweb/Inktank_Hardware_Configuration_Guide.pdf
> > ---
> >
> > Lets go from cheap and cheerful to "comes with racing stripes".
> >
> > 1) All spinning rust, all the time. Plunk in 45 drives, as JBOD behind
> > the cheapest (and densest) controllers you can get. Having the journal
> > on the disks will halve their performance, but you just wanted the
> > space and are not that pressed for IOPS.
> > The best you can expect per node with this setup is something around
> > 2300 IOPS with normal (7200RPM) disks.
> >
> > 2) Same as 1), but use controllers with a large HW cache (4GB Areca
> > comes to mind) in JBOD (or 45 times RAID0) mode.
> > This will alleviate some of the thrashing problems, particular if
> > you're expecting high IOPS to be in short bursts.
> >
> > 3) Ceph Classic, basically what Steve wrote.
> > 32HDDs, 8SSDs for journals (you do NOT want an uneven spread of
> > journals). This will give you sustainable 3200 IOPS, but of course the
> > journals on SSDs not only avoid all that trashing about on the disk
> > but also allow for coalescing of writes, so this is going to be
> > fastest solution so far. Of course you will need 3 of these at minimum
> > for acceptable redundancy, unlike 4) which just needs a replication
> > level of 2.
> >
> > 4) The anti-cephalopod. See my reply from a month ago in the link
> > above. All the arguments apply, it very much depends upon your use
> > case and budget. In my case the higher density, lower cost and ease of
> > maintaining the cluster where well worth the lower IOPS.
> >
> > 5) We can improve upon 3) by using HW cached controllers of course. And
> > hey, you did need to connect those drive bays somehow anyway. ^o^
> > Maybe even squeeze some more out of it by having the SSD controller
> > separate from the HDD one(s).
> > This is as fast (IOPS) as it comes w/o going to full SSD.
> >
> >
> > Networking:
> > Either of the setups above will saturate a single 10Gb/s aka 1GB/s as
> > Steve noted.
> > In fact 3) to 5) will be able to write up to 4GB/s in theory based on
> > the HDDs sequential performance, but that is unlikely to be seen in
> > real live. And of course your maximum write speed is  based on the
> > speed of the SSDs. So for example with 3) you would want those 8 SSDs
> > to have write speeds of about 250MB/s, giving you 2GB/s max write.
> > Which in turn means 2 10GB/s links at least, up to 4 if you want
> > redundancy and/or a separation of public and cluster network.
> >
> > RAM:
> > The more, the merrier.
> > It's relatively cheap and avoiding have to actually read from the disks
> > will make your write IOPS so much happier.
> >
> > CPU:
> > You'll want something like Steve recommended for 3), I'd go with 2

Re: [ceph-users] Optimal OSD Configuration for 45 drives?

2014-07-25 Thread Mark Nelson

On 07/25/2014 12:04 PM, Christian Balzer wrote:

On Fri, 25 Jul 2014 07:24:26 -0500 Mark Nelson wrote:


On 07/25/2014 02:54 AM, Christian Balzer wrote:

On Fri, 25 Jul 2014 13:31:34 +1000 Matt Harlum wrote:


Hi,

I’ve purchased a couple of 45Drives enclosures and would like to
figure out the best way to configure these for ceph?


That's the second time within a month somebody mentions these 45 drive
chassis.
Would you mind elaborating which enclosures these are precisely?


I'm guessing the supermicro SC847E26:

http://www.supermicro.com/products/chassis/4U/847/SC847E26-RJBOD1.cfm


Le Ouch!

They really must be getting  desperate for high density chassis that are
not top loading at Supermicro.

Well, if I read that link and the actual manual correctly, the most one
can hope to get from this is 48Gb/s (2 mini-SAS with 4 lanes each) which is
short of what 45 regular HDDs can dish out (or take in).
And that's ignoring the the inherent deficiencies when dealing with port
expanders.

Either way, a head for this kind of enclosure would need pretty much all
the things mentioned before, a low density (8 lanes), but high performance
and large cache controller and definitely SSDs for journals.

There must be some actual threshold, but my gut feeling tells me that
something slightly less dense where you don't have to get another case for
the head might turn out cheaper.
Especially if a 1U head (RAID/HBA and network cards) and space for
journal SSDs doesn't cut it.


Personally I'm a much bigger fan of the SC847A.  No expanders in the 
backplane, 36 3.5" bays with the MB integrated.  It's a bit old at this 
point and the fattwin nodes can go denser (both in terms of nodes and 
drives), but I've been pretty happy with it as a performance test 
platform.  It's really nice having the drives directly connected to the
controllers.  Having 4-5 controllers in 1 box is a bit tricky though; the
FatTwin Hadoop nodes are a bit nicer in that regard.


Mark



Christian



I'm wondering especially about the backplane, as 45 is such an odd
number.

Also if you don't mind, specify "a couple" and what your net storage
requirements are.

In fact, read this before continuing:
---
https://www.mail-archive.com/ceph-users@lists.ceph.com/msg11011.html
---


Mainly I was wondering if it was better to set up multiple raid groups
and then put an OSD on each rather than an OSD for each of the 45
drives in the chassis?


Steve already towed the conservative Ceph party line here, let me give
you some alternative views and options on top of that and to recap
what I wrote in the thread above.

In addition to his links, read this:
---
https://objects.dreamhost.com/inktankweb/Inktank_Hardware_Configuration_Guide.pdf
---

Lets go from cheap and cheerful to "comes with racing stripes".

1) All spinning rust, all the time. Plunk in 45 drives, as JBOD behind
the cheapest (and densest) controllers you can get. Having the journal
on the disks will halve their performance, but you just wanted the
space and are not that pressed for IOPS.
The best you can expect per node with this setup is something around
2300 IOPS with normal (7200RPM) disks.

2) Same as 1), but use controllers with a large HW cache (4GB Areca
comes to mind) in JBOD (or 45 times RAID0) mode.
This will alleviate some of the thrashing problems, particular if
you're expecting high IOPS to be in short bursts.

3) Ceph Classic, basically what Steve wrote.
32HDDs, 8SSDs for journals (you do NOT want an uneven spread of
journals). This will give you sustainable 3200 IOPS, but of course the
journals on SSDs not only avoid all that trashing about on the disk
but also allow for coalescing of writes, so this is going to be
fastest solution so far. Of course you will need 3 of these at minimum
for acceptable redundancy, unlike 4) which just needs a replication
level of 2.

4) The anti-cephalopod. See my reply from a month ago in the link
above. All the arguments apply, it very much depends upon your use
case and budget. In my case the higher density, lower cost and ease of
maintaining the cluster where well worth the lower IOPS.

5) We can improve upon 3) by using HW cached controllers of course. And
hey, you did need to connect those drive bays somehow anyway. ^o^
Maybe even squeeze some more out of it by having the SSD controller
separate from the HDD one(s).
This is as fast (IOPS) as it comes w/o going to full SSD.


Networking:
Either of the setups above will saturate a single 10Gb/s aka 1GB/s as
Steve noted.
In fact 3) to 5) will be able to write up to 4GB/s in theory based on
the HDDs sequential performance, but that is unlikely to be seen in
real live. And of course your maximum write speed is  based on the
speed of the SSDs. So for example with 3) you would want those 8 SSDs
to have write speeds of about 250MB/s, giving you 2GB/s max write.
Which in turn means 2 10GB/s links at least, up to 4 if you want
redundancy and/or a separation of public and cluster network.

RAM:
The more, the m

Re: [ceph-users] ceph-extras for rhel7

2014-07-25 Thread Simon Ironside

Hi again,

I've had a look at the qemu-kvm SRPM and RBD is intentionally disabled 
in the RHEL 7.0 release packages. There's a block in the .spec file that 
reads:


%if %{rhev}
--enable-live-block-ops \
--enable-ceph-support \
%else
--disable-live-block-ops \
--disable-ceph-support \
%endif

rhev is defined as 0 at the top of the file; setting this to 1 and
rebuilding, after sorting out the build dependencies, yields some new
packages with RBD support and a -rhev suffix that install and work on
RHEL 7.0 just fine. I tested with a KVM VM using RBD/cephx storage via
libvirt/qemu directly. As I was using virtio-scsi, TRIM also worked.


iasl was the only build requirement I wasn't able to satisfy, so I
commented it out (the comments state that it's not a hard requirement).
This doesn't seem to have had any ill effects for me.


To avoid the -rhev suffix I ultimately made the attached changes to the 
spec file before rebuilding them for myself.
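
For anyone wanting to reproduce this, the rebuild is roughly (assuming a
default ~/rpmbuild tree; the SRPM name will vary with the errata level):

    rpm -ivh qemu-kvm-1.5.3-60.el7.src.rpm
    sudo yum-builddep ~/rpmbuild/SPECS/qemu-kvm.spec   # minus iasl, as noted
    # apply the attached spec changes, then:
    rpmbuild -ba ~/rpmbuild/SPECS/qemu-kvm.spec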


Cheers,
Simon

On 21/07/14 14:23, Simon Ironside wrote:

Hi,

Is there going to be ceph-extras repos for rhel7?

Unless I'm very much mistaken I think the RHEL 7.0 release qemu-kvm
packages don't support RBD.

Cheers,
Simon.


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
--- qemu-kvm.spec.orig
+++ qemu-kvm.spec
@@ -74,7 +74,7 @@
 Summary: QEMU is a FAST! processor emulator
 Name: %{pkgname}%{?pkgsuffix}
 Version: 1.5.3
-Release: 60%{?dist}
+Release: 61%{?dist}
 # Epoch because we pushed a qemu-1.0 package. AIUI this can't ever be dropped
 Epoch: 10
 License: GPLv2+ and LGPLv2+ and BSD
@@ -2273,7 +2273,7 @@
 # iasl and cpp for acpi generation (not a hard requirement as we can use
 # pre-compiled files, but it's better to use this)
 %ifarch %{ix86} x86_64
-BuildRequires: iasl
+#BuildRequires: iasl
 BuildRequires: cpp
 %endif
 
@@ -3551,14 +3551,14 @@
 --enable-ceph-support \
 %else
 --disable-live-block-ops \
---disable-ceph-support \
+--enable-ceph-support \
 %endif
 --disable-live-block-migration \
 --enable-glusterfs \
 %if %{rhev}
 --block-drv-rw-whitelist=qcow2,raw,file,host_device,nbd,iscsi,gluster,rbd \
 %else
---block-drv-rw-whitelist=qcow2,raw,file,host_device,nbd,iscsi,gluster \
+--block-drv-rw-whitelist=qcow2,raw,file,host_device,nbd,iscsi,gluster,rbd \
 %endif
 --block-drv-ro-whitelist=vmdk,vhdx,vpc \
 "$@"
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Optimal OSD Configuration for 45 drives?

2014-07-25 Thread Schweiss, Chip
On Fri, Jul 25, 2014 at 12:04 PM, Christian Balzer  wrote:

>
> Well, if I read that link and the actual manual correctly, the most one
> can hope to get from this is 48Gb/s (2 mini-SAS with 4 lanes each) which is
> short of what 45 regular HDDs can dish out (or take in).
> And that's ignoring the the inherent deficiencies when dealing with port
> expanders.
>
>
Actually the limit is 96Gb/s.  The front and back have independent SAS
expanders.  There are 4 internal-to-external mini-SAS cables.
-Chip
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] anti-cephalopod question

2014-07-25 Thread Robert Fantini
I've a question regarding advice from these threads:
https://mail.google.com/mail/u/0/#label/ceph/1476b93097673ad7?compose=1476ec7fef10fd01

https://www.mail-archive.com/ceph-users@lists.ceph.com/msg11011.html



 Our current setup has 4 OSDs per node. When a drive fails, the cluster is
almost unusable for data entry. I want to change our setup so that this
never happens under any circumstances.

 Network: we use 2 IB switches and bonding in failover mode.
 Systems are two Dell PowerEdge R720 and Supermicro X8DT3.

 So, looking at how to do things better, we will try '#4 - anti-cephalopod'.
That is a seriously funny phrase!

We'll switch to using RAID-10 or RAID-6 and have one OSD per node, using
high-end RAID controllers, hot spares, etc.

And use one Intel 200GB S3700 per node for the journal.

My questions:

Is there a minimum number of OSDs which should be used?

Should the number of OSDs per node be the same?

best regards, Rob


PS:  I had asked above in middle of another thread...  please ignore there.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] More problems building Ceph....

2014-07-25 Thread Deven Phillips
I'm trying to build DEB packages for my armhf devices, but my most recent
efforts are dying. Any suggestions would be MOST welcome!

make[5]: Entering directory `/home/cubie/Source/ceph/src/java'
jar cf libcephfs.jar -C java com/ceph/fs/CephMount.class -C java
com/ceph/fs/CephStat.class -C java com/ceph/fs/CephStatVFS.class -C java
com/ceph/fs/CephNativeLoader.class -C java
com/ceph/fs/CephNotMountedException.class -C java
com/ceph/fs/CephFileAlreadyExistsException.class -C java
com/ceph/fs/CephAlreadyMountedException.class -C java
com/ceph/fs/CephNotDirectoryException.class -C java
com/ceph/fs/CephPoolException.class -C java
com/ceph/fs/CephFileExtent.class -C java com/ceph/crush/Bucket.class
export CLASSPATH=:/usr/share/java/junit4.jar:java/:test/ ; \
javac -source 1.5 -target 1.5 -Xlint:-options
test/com/ceph/fs/*.java
jar cf libcephfs-test.jar -C test com/ceph/fs/CephDoubleMountTest.class -C
test com/ceph/fs/CephMountCreateTest.class -C test
com/ceph/fs/CephMountTest.class -C test com/ceph/fs/CephUnmountedTest.class
-C test com/ceph/fs/CephAllTests.class
make[5]: Leaving directory `/home/cubie/Source/ceph/src/java'
make[4]: Leaving directory `/home/cubie/Source/ceph/src/java'
Making all in libs3
make[4]: Entering directory `/home/cubie/Source/ceph/src/libs3'
make[4]: *** No rule to make target `all'.  Stop.
make[4]: Leaving directory `/home/cubie/Source/ceph/src/libs3'
make[3]: *** [all-recursive] Error 1
make[3]: Leaving directory `/home/cubie/Source/ceph/src'
make[2]: *** [all] Error 2
make[2]: Leaving directory `/home/cubie/Source/ceph/src'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/cubie/Source/ceph'
make: *** [build-stamp] Error 2
dpkg-buildpackage: error: debian/rules build gave error exit status 2

Thanks in advance!

Deven
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] More problems building Ceph....

2014-07-25 Thread Noah Watkins
Make sure you are initializing the submodules. The autogen.sh script
should probably notify users when these are missing and/or initialize
them automatically.

git submodule init
git submodule update

or alternatively, git clone --recursive ...

On Fri, Jul 25, 2014 at 11:48 AM, Deven Phillips
 wrote:
> I'm trying to build DEB packages for my armhf devices, but my most recent
> efforts are dying. Anny suggestions would be MOST welcome!
>
> make[5]: Entering directory `/home/cubie/Source/ceph/src/java'
> jar cf libcephfs.jar -C java com/ceph/fs/CephMount.class -C java
> com/ceph/fs/CephStat.class -C java com/ceph/fs/CephStatVFS.class -C java
> com/ceph/fs/CephNativeLoader.class -C java
> com/ceph/fs/CephNotMountedException.class -C java
> com/ceph/fs/CephFileAlreadyExistsException.class -C java
> com/ceph/fs/CephAlreadyMountedException.class -C java
> com/ceph/fs/CephNotDirectoryException.class -C java
> com/ceph/fs/CephPoolException.class -C java com/ceph/fs/CephFileExtent.class
> -C java com/ceph/crush/Bucket.class
> export CLASSPATH=:/usr/share/java/junit4.jar:java/:test/ ; \
> javac -source 1.5 -target 1.5 -Xlint:-options
> test/com/ceph/fs/*.java
> jar cf libcephfs-test.jar -C test com/ceph/fs/CephDoubleMountTest.class -C
> test com/ceph/fs/CephMountCreateTest.class -C test
> com/ceph/fs/CephMountTest.class -C test com/ceph/fs/CephUnmountedTest.class
> -C test com/ceph/fs/CephAllTests.class
> make[5]: Leaving directory `/home/cubie/Source/ceph/src/java'
> make[4]: Leaving directory `/home/cubie/Source/ceph/src/java'
> Making all in libs3
> make[4]: Entering directory `/home/cubie/Source/ceph/src/libs3'
> make[4]: *** No rule to make target `all'.  Stop.
> make[4]: Leaving directory `/home/cubie/Source/ceph/src/libs3'
> make[3]: *** [all-recursive] Error 1
> make[3]: Leaving directory `/home/cubie/Source/ceph/src'
> make[2]: *** [all] Error 2
> make[2]: Leaving directory `/home/cubie/Source/ceph/src'
> make[1]: *** [all-recursive] Error 1
> make[1]: Leaving directory `/home/cubie/Source/ceph'
> make: *** [build-stamp] Error 2
> dpkg-buildpackage: error: debian/rules build gave error exit status 2
>
> Thanks in advance!
>
> Deven
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] More problems building Ceph....

2014-07-25 Thread Noah Watkins
Oh, it looks like autogen.sh is smart about that now. If you're using the
latest master, my suggestion may not be the solution.

On Fri, Jul 25, 2014 at 11:51 AM, Noah Watkins  wrote:
> Make sure you are intializing the sub-modules.. the autogen.sh script
> should probably notify users when these are missing and/or initialize
> them automatically..
>
> git submodule init
> git submodule update
>
> or alternatively, git clone --recursive ...
>
> On Fri, Jul 25, 2014 at 11:48 AM, Deven Phillips
>  wrote:
>> I'm trying to build DEB packages for my armhf devices, but my most recent
>> efforts are dying. Anny suggestions would be MOST welcome!
>>
>> make[5]: Entering directory `/home/cubie/Source/ceph/src/java'
>> jar cf libcephfs.jar -C java com/ceph/fs/CephMount.class -C java
>> com/ceph/fs/CephStat.class -C java com/ceph/fs/CephStatVFS.class -C java
>> com/ceph/fs/CephNativeLoader.class -C java
>> com/ceph/fs/CephNotMountedException.class -C java
>> com/ceph/fs/CephFileAlreadyExistsException.class -C java
>> com/ceph/fs/CephAlreadyMountedException.class -C java
>> com/ceph/fs/CephNotDirectoryException.class -C java
>> com/ceph/fs/CephPoolException.class -C java com/ceph/fs/CephFileExtent.class
>> -C java com/ceph/crush/Bucket.class
>> export CLASSPATH=:/usr/share/java/junit4.jar:java/:test/ ; \
>> javac -source 1.5 -target 1.5 -Xlint:-options
>> test/com/ceph/fs/*.java
>> jar cf libcephfs-test.jar -C test com/ceph/fs/CephDoubleMountTest.class -C
>> test com/ceph/fs/CephMountCreateTest.class -C test
>> com/ceph/fs/CephMountTest.class -C test com/ceph/fs/CephUnmountedTest.class
>> -C test com/ceph/fs/CephAllTests.class
>> make[5]: Leaving directory `/home/cubie/Source/ceph/src/java'
>> make[4]: Leaving directory `/home/cubie/Source/ceph/src/java'
>> Making all in libs3
>> make[4]: Entering directory `/home/cubie/Source/ceph/src/libs3'
>> make[4]: *** No rule to make target `all'.  Stop.
>> make[4]: Leaving directory `/home/cubie/Source/ceph/src/libs3'
>> make[3]: *** [all-recursive] Error 1
>> make[3]: Leaving directory `/home/cubie/Source/ceph/src'
>> make[2]: *** [all] Error 2
>> make[2]: Leaving directory `/home/cubie/Source/ceph/src'
>> make[1]: *** [all-recursive] Error 1
>> make[1]: Leaving directory `/home/cubie/Source/ceph'
>> make: *** [build-stamp] Error 2
>> dpkg-buildpackage: error: debian/rules build gave error exit status 2
>>
>> Thanks in advance!
>>
>> Deven
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] More problems building Ceph....

2014-07-25 Thread Abhishek L

Noah Watkins writes:

> Oh, it looks like autogen.sh is smart about that now. If you using the
> latest master, my suggestion may not be the solution.
>
> On Fri, Jul 25, 2014 at 11:51 AM, Noah Watkins  
> wrote:
>> Make sure you are intializing the sub-modules.. the autogen.sh script
>> should probably notify users when these are missing and/or initialize
>> them automatically..
>>
>> git submodule init
>> git submodule update
>>
>> or alternatively, git clone --recursive ...
>>
>> On Fri, Jul 25, 2014 at 11:48 AM, Deven Phillips
>>  wrote:
>>> I'm trying to build DEB packages for my armhf devices, but my most recent
>>> efforts are dying. Anny suggestions would be MOST welcome!
>>>
>>> make[5]: Entering directory `/home/cubie/Source/ceph/src/java'
>>> jar cf libcephfs.jar -C java com/ceph/fs/CephMount.class -C java
>>> com/ceph/fs/CephStat.class -C java com/ceph/fs/CephStatVFS.class -C java
>>> com/ceph/fs/CephNativeLoader.class -C java
>>> com/ceph/fs/CephNotMountedException.class -C java
>>> com/ceph/fs/CephFileAlreadyExistsException.class -C java
>>> com/ceph/fs/CephAlreadyMountedException.class -C java
>>> com/ceph/fs/CephNotDirectoryException.class -C java
>>> com/ceph/fs/CephPoolException.class -C java com/ceph/fs/CephFileExtent.class
>>> -C java com/ceph/crush/Bucket.class
>>> export CLASSPATH=:/usr/share/java/junit4.jar:java/:test/ ; \
>>> javac -source 1.5 -target 1.5 -Xlint:-options
>>> test/com/ceph/fs/*.java
>>> jar cf libcephfs-test.jar -C test com/ceph/fs/CephDoubleMountTest.class -C
>>> test com/ceph/fs/CephMountCreateTest.class -C test
>>> com/ceph/fs/CephMountTest.class -C test com/ceph/fs/CephUnmountedTest.class
>>> -C test com/ceph/fs/CephAllTests.class
>>> make[5]: Leaving directory `/home/cubie/Source/ceph/src/java'
>>> make[4]: Leaving directory `/home/cubie/Source/ceph/src/java'
>>> Making all in libs3
>>> make[4]: Entering directory `/home/cubie/Source/ceph/src/libs3'
>>> make[4]: *** No rule to make target `all'.  Stop.
>>> make[4]: Leaving directory `/home/cubie/Source/ceph/src/libs3'
>>> make[3]: *** [all-recursive] Error 1
>>> make[3]: Leaving directory `/home/cubie/Source/ceph/src'
>>> make[2]: *** [all] Error 2
>>> make[2]: Leaving directory `/home/cubie/Source/ceph/src'
>>> make[1]: *** [all-recursive] Error 1
>>> make[1]: Leaving directory `/home/cubie/Source/ceph'
>>> make: *** [build-stamp] Error 2
>>> dpkg-buildpackage: error: debian/rules build gave error exit status 2

For me (Ubuntu trusty), building via dpkg-buildpackage seems to work
perfectly fine.

However, the other day when I tried building libs3 as a standalone package,
it errored out. There I found that the Makefile has a default version number
of trunk.trunk, which breaks the Debian rules. Changing that to a numeric
value seemed to work.

>>>
>>> Thanks in advance!
>>>
>>> Deven
>>>
--
Abhishek

>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] More problems building Ceph....

2014-07-25 Thread Deven Phillips
I'm using the v0.82 tagged version from Git.


On Fri, Jul 25, 2014 at 2:54 PM, Noah Watkins 
wrote:

> Oh, it looks like autogen.sh is smart about that now. If you using the
> latest master, my suggestion may not be the solution.
>
> On Fri, Jul 25, 2014 at 11:51 AM, Noah Watkins 
> wrote:
> > Make sure you are intializing the sub-modules.. the autogen.sh script
> > should probably notify users when these are missing and/or initialize
> > them automatically..
> >
> > git submodule init
> > git submodule update
> >
> > or alternatively, git clone --recursive ...
> >
> > On Fri, Jul 25, 2014 at 11:48 AM, Deven Phillips
> >  wrote:
> >> I'm trying to build DEB packages for my armhf devices, but my most
> recent
> >> efforts are dying. Anny suggestions would be MOST welcome!
> >>
> >> [...build log snipped, identical to the one quoted above...]
> >>
> >> Thanks in advance!
> >>
> >> Deven
> >>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] More problems building Ceph....

2014-07-25 Thread Deven Phillips
Noah,

 That DOES appear to have been at least part of the problem... The
src/libs3/ directory was empty, and when I tried to use submodules to update
it I got errors about non-empty directories... Trying to fix that now.

Thanks!

Deven


On Fri, Jul 25, 2014 at 2:51 PM, Noah Watkins 
wrote:

> Make sure you are initializing the submodules... the autogen.sh script
> should probably notify users when these are missing and/or initialize
> them automatically.
>
> git submodule init
> git submodule update
>
> or alternatively, git clone --recursive ...
>
> On Fri, Jul 25, 2014 at 11:48 AM, Deven Phillips
>  wrote:
> > I'm trying to build DEB packages for my armhf devices, but my most recent
> > efforts are dying. Any suggestions would be MOST welcome!
> >
> > [...build log snipped, identical to the one quoted above...]
> >
> > Thanks in advance!
> >
> > Deven
> >
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] More problems building Ceph....

2014-07-25 Thread Noah Watkins
You can rm -rf those submodule directories and then re-run submodule
init/update to put the tree in a good state without re-cloning.
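
Concretely, something along these lines (the paths follow the build log
above; adjust for whichever submodule is actually in a bad state):

  cd ~/Source/ceph
  rm -rf src/libs3          # remove the half-populated submodule directory
  git submodule init
  git submodule update      # re-clones the missing submodule contents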

On Fri, Jul 25, 2014 at 12:10 PM, Deven Phillips
 wrote:
> Noah,
>
>  That DOES appear to have been at least part of the problem... The
> src/libs3/ directory was empty, and when I tried to use submodules to update
> it I got errors about non-empty directories... Trying to fix that now.
>
> Thanks!
>
> Deven
>
>
> On Fri, Jul 25, 2014 at 2:51 PM, Noah Watkins 
> wrote:
>>
>> Make sure you are initializing the submodules... the autogen.sh script
>> should probably notify users when these are missing and/or initialize
>> them automatically.
>>
>> git submodule init
>> git submodule update
>>
>> or alternatively, git clone --recursive ...
>>
>> On Fri, Jul 25, 2014 at 11:48 AM, Deven Phillips
>>  wrote:
>> > I'm trying to build DEB packages for my armhf devices, but my most recent
>> > efforts are dying. Any suggestions would be MOST welcome!
>> >
>> > [...build log snipped, identical to the one quoted above...]
>> >
>> > Thanks in advance!
>> >
>> > Deven
>> >
>
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Optimal OSD Configuration for 45 drives?

2014-07-25 Thread Christian Balzer
On Fri, 25 Jul 2014 13:14:59 -0500 Schweiss, Chip wrote:

> On Fri, Jul 25, 2014 at 12:04 PM, Christian Balzer  wrote:
> 
> >
> > Well, if I read that link and the actual manual correctly, the most one
> > can hope to get from this is 48Gb/s (2 mini-SAS with 4 lanes each)
> > which is short of what 45 regular HDDs can dish out (or take in).
> > And that's ignoring the inherent deficiencies when dealing with
> > port expanders.
> >
> >
> Actually the limit is 96Gb/s. The front and back have independent SAS
> expanders. There are 4 internal-to-external mini-SAS cables.
> 
If you read section E of the manual closely and stare at the back of the
case, you will see that while there are indeed 4 external SAS connectors
right next to the power supply, only 2 of those are inbound (upstream, to
the HBA) and the other 2 are outbound (downstream).

So my number stands: 4 lanes at 6Gb/s times 2 = 48Gb/s.

Which also means that if one were to put faster drives in there, the back
side, with slightly more bandwidth per drive, would be the preferred
location.
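
For reference, the back-of-the-envelope numbers (assuming ~150MB/s
sequential per 7200RPM drive, which is my assumption, not a spec):

  echo $(( 2 * 4 * 6 ))            # inbound SAS: 2 ports x 4 lanes x 6Gb/s = 48 Gb/s
  echo $(( 45 * 150 * 8 / 1000 ))  # 45 HDDs x 150MB/s expressed in Gb/s = 54 Gb/s

So the spinners alone can already outrun the two upstream links, before
even counting port expander overhead.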

Christian
-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com   Global OnLine Japan/Fusion Communications
http://www.gol.com/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] anti-cephalopod question

2014-07-25 Thread Christian Balzer

Hello,

actually, replying in the other thread was fine by me; it was, after all,
relevant to it in a sense.
And you mentioned something important there, which you didn't mention
below: that you're coming from DRBD with a lot of experience there.

So do I, and Ceph/RBD simply isn't (and probably never will be) an adequate
replacement for DRBD in some use cases.
I certainly plan to keep deploying DRBD where it makes more sense
(IOPS/speed), while migrating everything else to Ceph.

Anyway, let's look at your mail:

On Fri, 25 Jul 2014 14:33:56 -0400 Robert Fantini wrote:

> I've a question regarding advice from these threads:
> https://mail.google.com/mail/u/0/#label/ceph/1476b93097673ad7?compose=1476ec7fef10fd01
> 
> https://www.mail-archive.com/ceph-users@lists.ceph.com/msg11011.html
> 
> 
> 
> Our current setup has 4 OSDs per node. When a drive fails, the cluster is
> almost unusable for data entry. I want to change our setup so that this
> never happens under any circumstances.
> 

While you can pretty much prevent this from happening, your cluster should
still be able to handle a recovery.
Ceph is a bit more ham-fisted than DRBD and definitely needs more controls
and tuning to make recoveries have less of an impact, but you would see
something similar with DRBD and badly configured recovery speeds.
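
As a sketch of what that tuning looks like, these are the knobs people
usually turn down in ceph.conf to soften recovery impact (the values are
illustrative only, defaults vary by release):

  [osd]
  osd max backfills = 1
  osd recovery max active = 1
  osd recovery op priority = 1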

In essence, if your current setup can't handle the loss of a single disk,
what happens if a node fails?
You will need to design (HW) and configure (various Ceph options) your
cluster to handle these things because at some point a recovery might be
unavoidable. 

To prevent recoveries caused by failed disks, use RAID; for node failures,
you could permanently set the noout flag or have your monitoring software
set it when it detects a node failure.
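
A minimal sketch of the node-failure side of that (the flags are real, but
when to set and unset them is up to you or your monitoring):

  ceph osd set noout      # down OSDs are not marked "out", so no rebalance starts
  ceph osd unset noout    # restore normal behaviour once the node is back

Alternatively, raising "mon osd down out interval" in ceph.conf buys a
similar grace period without leaving noout set permanently.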

> Network: we use 2 IB switches and bonding in failover mode.
> Systems are two Dell PowerEdge R720 and a Supermicro X8DT3.
> 

I'm confused. Those Dells tend to have 8 drive bays normally, don't they?
So you're just using 4 HDDs for OSDs? No SSD journals?
Just 2 storage nodes? 
Note that unless you do use RAIDed OSDs, this leaves you vulnerable to dual
disk failures. Which will happen.

Also, that SM product number is for a motherboard, not a server; is that
your monitor host?
Anything in production with data on it that you value should have 3 mon
hosts. If you can't afford dedicated ones, sharing them with an OSD node
(preferably with the OS on SSDs to keep leveldb happy) is better than just
one, because if that one dies or gets corrupted, your data is inaccessible.

> So looking at how to do things better, we will try '#4 - anti-cephalopod'.
> That is a seriously funny phrase!
> 
> We'll switch to using RAID-10 or RAID-6 and have one OSD per node, using
> high-end RAID controllers, hot spares, etc.
> 
Are you still talking about the same hardware as above, with just 4 HDDs
for storage?
With 4 HDDs I'd go for RAID10 (you definitely want a hot spare there); if
you have more bays, use up to 12 of them for RAID6 with a high-performance,
large HW cache controller.

> And use one Intel 200GB S3700 per node for the journal
> 
That's barely enough for 4 HDDs (the 200GB S3700 tops out at about 365MB/s
writes), but it will do nicely if those HDDs are in a RAID10 (which writes
at roughly half the combined speed of its drives).
Keep in mind that your node will never be able to write faster than the
speed of your journal.
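
Rough numbers behind that, assuming ~150MB/s sequential per HDD (an
assumption, adjust for your actual drives):

  echo $(( 4 * 150 ))   # 4 JBOD OSDs: 600MB/s of backing store behind a ~365MB/s journal
  echo $(( 2 * 150 ))   # 4-disk RAID10: ~300MB/s, comfortably under the 365MB/s journal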

> My questions:
> 
> is there a minimum number of OSD's which should be used?
> 
If you have one OSD per node and the disks are RAIDed, 2 OSDs (aka 2 nodes)
are sufficient to begin with.
However, your performance might not be what you expect (an OSD process
seems to be incapable of doing more than about 800 write IOPS).
With a 4-disk RAID10 (essentially 2 HDDs, so about 200 IOPS) that's not so
much of an issue.
In my case, with an 11-disk RAID6 AND the 4GB HW cache Areca controller, it
certainly is rather frustrating.

In short, the more nodes (OSDs) you can deploy, the better the performance
will be. And of course, in case a node dies and you don't think it can be
brought back within a sensibly short time frame, having more than 2 nodes
will enable you to do a recovery/rebalance and restore your redundancy to
the desired level.

> should  OSD's per node be the same?
> 
It is advantageous to have identical disks and OSD sizes; it makes the
whole thing more predictable and you don't have to play with weights.
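
For completeness, "playing with weights" means something like this
(hypothetical OSD id and weight, shown only to illustrate what uniform
disks let you skip):

  ceph osd crush reweight osd.3 2.73    # e.g. weight a 3TB OSD at ~2.73 (TiB)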

As for having a different number of OSDs per node, consider this example:

4 nodes with 1 OSD each, one node with 4 OSDs (all OSDs of the same size).
What will happen here is that all the replicas from the single-OSD nodes
might wind up on the 4-OSD node, so it had better have more power in all
aspects than the single-OSD nodes.
Now that node fails and you decide to let things rebalance, as it can't be
repaired shortly. But your cluster was half full, and now it will be 100%
full and become unusable (for writes).
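
To put numbers on that (assuming replication 2 and every OSD = 1 unit of
raw capacity):

  # 8 OSDs total, cluster 50% full  -> 4 units of data stored (replicas included)
  # the 4-OSD node dies             -> 4 units of raw capacity left
  # rebalancing the same 4 units of data onto them -> 100% full, writes stop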

So the moral of the story: deploy as much identical HW as possible. 

Christian

> best regards, Rob
> 
> 
> PS: I had asked the above in the middle of another thread...

[ceph-users] fs as btrfs and ceph journal

2014-07-25 Thread Cristian Falcas
Hello,

I'm using btrfs for my OSDs and want to know if it still helps to have the
journal on a faster drive. From what I've read, I'm under the impression
that with btrfs the OSD journal doesn't do much work anymore.

Best regards,
Cristian Falcas
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com