[ceph-users] Low write speed

2014-01-17 Thread Никитенко Виталий
Good day! Please help me solve a problem. I have the following setup:
An ESXi server with 1Gb NICs. It has a local datastore (store2Tb) and two iSCSI
datastores connected to a second server.
The second server is a Supermicro box: two 1TB HDDs (LSI 9261-8i with battery), 8 CPU
cores, 32 GB RAM and two 1Gb NICs. Ubuntu 12 and ceph-emperor are installed on
/dev/sda; the /dev/sdb disk is used for osd.0.
What I do next:
  # rbd create esxi
  # rbd map esxi

I get /dev/rbd1, which is shared using iscsitarget:

  # cat ietd.conf
  Target iqn.2014-01.ru.ceph:rados.iscsi.001
    Lun 0 Path = /dev/rbd1, Type = blockio, ScsiId = f817ab
  Target iqn.2014-01.ru.ceph:rados.iscsi.002
    Lun 1 Path = /opt/storlun0.bin, Type = fileio, ScsiId = lun1, ScsiSN = lun1

For testing I also created an iSCSI store on /dev/sda (Lun1).
When migrating a virtual machine from store2Tb to Lun0 (Ceph), the migration rate is
400-450 Mbit/s.
When migrating a VM from store2Tb to Lun1 (plain Ubuntu file), the rate is
800-900 Mbit/s.
From this I conclude that the rate is limited neither by the disk (controller) nor by
the network.
I tried formatting the OSD with ext4, xfs and btrfs, but the speed is the same. Speed
is very important to me, especially since we plan
to move to 10Gb network links.
Thanks.
Vitaliy
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Low write speed

2014-01-17 Thread Wido den Hollander

On 01/17/2014 10:01 AM, Никитенко Виталий wrote:

Good day! Please help me solve a problem. I have the following setup:
An ESXi server with 1Gb NICs. It has a local datastore (store2Tb) and two iSCSI
datastores connected to a second server.
The second server is a Supermicro box: two 1TB HDDs (LSI 9261-8i with battery), 8 CPU
cores, 32 GB RAM and two 1Gb NICs. Ubuntu 12 and ceph-emperor are installed on
/dev/sda; the /dev/sdb disk is used for osd.0.


How do you do journaling?
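For instance, whether the journal sits on the same spindle as the data makes a 
big difference. Purely as an illustration (the device name is hypothetical), a 
journal on a separate partition would look something like this in ceph.conf:

   [osd.0]
       # hypothetical example: journal on a dedicated partition instead of the data disk
       osd journal = /dev/sdc1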


What I do next:
   # rbd create esxi
   # rbd map esxi

I get /dev/rbd1, which is shared using iscsitarget:

   # cat ietd.conf
   Target iqn.2014-01.ru.ceph:rados.iscsi.001
     Lun 0 Path = /dev/rbd1, Type = blockio, ScsiId = f817ab
   Target iqn.2014-01.ru.ceph:rados.iscsi.002
     Lun 1 Path = /opt/storlun0.bin, Type = fileio, ScsiId = lun1, ScsiSN = lun1

For testing I also created an iSCSI store on /dev/sda (Lun1).
When migrating a virtual machine from store2Tb to Lun0 (Ceph), the migration rate is
400-450 Mbit/s.
When migrating a VM from store2Tb to Lun1 (plain Ubuntu file), the rate is
800-900 Mbit/s.
From this I conclude that the rate is limited neither by the disk (controller) nor by
the network.
I tried formatting the OSD with ext4, xfs and btrfs, but the speed is the same. Speed
is very important to me, especially since we plan
to move to 10Gb network links.


Have you tried TGT instead? It uses librbd instead of using the Kernel 
layers for RBD and iSCSI: 
http://ceph.com/dev-notes/updates-to-ceph-tgt-iscsi-support/
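If you go that route, a rough sketch of a target definition with the rbd 
backing store might look like the following (assuming tgt was built with rbd 
support; the pool/image name is taken from your mail):

   <target iqn.2014-01.ru.ceph:rados.iscsi.001>
       # rbd backing store: tgt talks to the cluster via librbd, no kernel 'rbd map' needed
       bs-type rbd
       backing-store rbd/esxi
   </target>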


Have you also tried to run a rados benchmark? (rados bench)
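For example, something along these lines (pool name and thread count are just 
placeholders):

   # 60-second write test with 16 concurrent operations, keeping the objects
   rados bench -p rbd 60 write -t 16 --no-cleanup
   # sequential read test against the objects left behind
   rados bench -p rbd 60 seq -t 16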

Also, be aware that Ceph excels at parallel performance. You 
shouldn't look at the performance of a single "LUN" or RBD image that 
much; it's much more interesting to see the aggregated performance of 10 
or maybe 100 "LUNs" together.



Thanks.
Vitaliy
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




--
Wido den Hollander
42on B.V.

Phone: +31 (0)20 700 9902
Skype: contact42on
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] v0.75 released

2014-01-17 Thread Ilya Dryomov
On Fri, Jan 17, 2014 at 2:05 AM, Christian Balzer  wrote:
> On Thu, 16 Jan 2014 15:51:17 +0200 Ilya Dryomov wrote:
>
>> On Wed, Jan 15, 2014 at 5:42 AM, Sage Weil  wrote:
>> >
>> > [...]
>> >
>> > * rbd: support for 4096 mapped devices, up from ~250 (Ilya Dryomov)
>>
>> Just a note, v0.75 simply adds some of the infrastructure, the actual
>> support for this will arrive with kernel 3.14.  The theoretical limit
>> is 65536 mapped devices, although I admit I haven't tried mapping more
>> than ~4000 at once.
>>
> Just for clarification, this is for the client side when using the kernel
> module, right?
>
> Not looking at more than about 150 devices per compute node now, but that
> might change and there is also the case of failovers...

Yes, this is how many 'rbd map ...'s a single rbd kernel module (and
therefore a single compute node) can handle.  Kernels 3.13 and below
can handle ~130-150, depending on the machine.
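
As an aside, a quick way to check what a node currently has mapped:

   # lists id, pool, image, snap and device for each mapped image
   rbd showmapped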

Thanks,

Ilya
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] v0.75 released

2014-01-17 Thread Ilya Dryomov
On Fri, Jan 17, 2014 at 11:20 AM, Ilya Dryomov  wrote:
> On Fri, Jan 17, 2014 at 2:05 AM, Christian Balzer  wrote:
>> On Thu, 16 Jan 2014 15:51:17 +0200 Ilya Dryomov wrote:
>>
>>> On Wed, Jan 15, 2014 at 5:42 AM, Sage Weil  wrote:
>>> >
>>> > [...]
>>> >
>>> > * rbd: support for 4096 mapped devices, up from ~250 (Ilya Dryomov)
>>>
>>> Just a note, v0.75 simply adds some of the infrastructure, the actual
>>> support for this will arrive with kernel 3.14.  The theoretical limit
>>> is 65536 mapped devices, although I admit I haven't tried mapping more
>>> than ~4000 at once.
>>>
>> Just for clarification, this is for the client side when using the kernel
>> module, right?
>>
>> Not looking at more than about 150 devices per compute node now, but that
>> might change and there is also the case of failovers...
>
> Yes, this is how many 'rbd map ...'s a single rbd kernel module (and
> therefore a single compute node) can handle.  Kernels 3.13 and below
> can handle ~130-150, depending on the machine.

Sorry, typoed.  ~230-250.

Thanks,

Ilya
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Low write speed

2014-01-17 Thread Ирек Фасихов
Hi, Виталий.
Do you have a sufficient number of PGs?
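For example, something like this to check and (if needed) raise it; the pool 
name 'rbd' here is only an assumption:

   ceph osd pool get rbd pg_num
   # if the value is too low for the number of OSDs, e.g.:
   ceph osd pool set rbd pg_num 128
   ceph osd pool set rbd pgp_num 128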


2014/1/17 Никитенко Виталий 

> Good day! Please help me solve a problem. I have the following setup:
> An ESXi server with 1Gb NICs. It has a local datastore (store2Tb) and two iSCSI
> datastores connected to a second server.
> The second server is a Supermicro box: two 1TB HDDs (LSI 9261-8i with battery), 8
> CPU cores, 32 GB RAM and two 1Gb NICs. Ubuntu 12 and ceph-emperor are installed on
> /dev/sda; the /dev/sdb disk is used for osd.0.
> What I do next:
>   # rbd create esxi
>   # rbd map esxi
>
> I get /dev/rbd1, which is shared using iscsitarget:
>
>   # cat ietd.conf
>   Target iqn.2014-01.ru.ceph:rados.iscsi.001
>     Lun 0 Path = /dev/rbd1, Type = blockio, ScsiId = f817ab
>   Target iqn.2014-01.ru.ceph:rados.iscsi.002
>     Lun 1 Path = /opt/storlun0.bin, Type = fileio, ScsiId = lun1, ScsiSN = lun1
>
> For testing I also created an iSCSI store on /dev/sda (Lun1).
> When migrating a virtual machine from store2Tb to Lun0 (Ceph), the migration rate
> is 400-450 Mbit/s.
> When migrating a VM from store2Tb to Lun1 (plain Ubuntu file), the rate is
> 800-900 Mbit/s.
> From this I conclude that the rate is limited neither by the disk (controller) nor
> by the network.
> I tried formatting the OSD with ext4, xfs and btrfs, but the speed is the same.
> Speed is very important to me, especially since we plan
> to move to 10Gb network links.
> Thanks.
> Vitaliy
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Best regards, Фасихов Ирек Нургаязович
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] mon ip addr is not followed ceph config file

2014-01-17 Thread Tim Zhang
Hi guys,
I use ceph-deploy to deploy my ceph cluster.
This is my config file:
-
[global]
osd pool default size = 3
auth_service_required = none
filestore_xattr_use_omap = true
journal zero on create = true
auth_client_required = none
auth_cluster_required = none
mon_host = 192.168.1.172,192.168.1.130,192.168.1.115
osd_journal_size = 1024
public_network = 192.168.1.0/24
mon_initial_members = node30, node31, node32
cluster_network = 192.168.1.0/24
fsid = da79afb2-d85e-406a-b05b-80eaaac2e179
-

After deploying, the cluster is unhealthy and I find that the mon addresses do not
match the config file settings; the dedicated mon addresses (mon_host) end up in the
extra_probe_peers field according to the command:
ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.node30.asok
mon_status;
and it seems the cluster is using other IP addresses on the hosts. That's strange;
can anyone give some suggestions?

The output is as follows:
[node30][INFO  ] Running command: ceph --cluster=ceph --admin-daemon
/var/run/ceph/ceph-mon.node30.asok mon_status
[node30][DEBUG ]

[node30][DEBUG ] status for monitor: mon.node30
[node30][DEBUG ] {
[node30][DEBUG ]   "election_epoch": 0,
[node30][DEBUG ]   "extra_probe_peers": [
[node30][DEBUG ] "192.168.1.115:6789/0",
[node30][DEBUG ] "192.168.1.130:6789/0",
[node30][DEBUG ] "192.168.1.172:6789/0"
[node30][DEBUG ]   ],
[node30][DEBUG ]   "monmap": {
[node30][DEBUG ] "created": "0.00",
[node30][DEBUG ] "epoch": 0,
[node30][DEBUG ] "fsid": "0d00a742-7ac1-4535-b0dc-26f5a0fe7924",
[node30][DEBUG ] "modified": "0.00",
[node30][DEBUG ] "mons": [
[node30][DEBUG ]   {
[node30][DEBUG ] "addr": "192.168.1.173:6789/0",
[node30][DEBUG ] "name": "node30",
[node30][DEBUG ] "rank": 0
[node30][DEBUG ]   },
[node30][DEBUG ]   {
[node30][DEBUG ] "addr": "0.0.0.0:0/1",
[node30][DEBUG ] "name": "node31",
[node30][DEBUG ] "rank": 1
[node30][DEBUG ]   },
[node30][DEBUG ]   {
[node30][DEBUG ] "addr": "0.0.0.0:0/2",
[node30][DEBUG ] "name": "node32",
[node30][DEBUG ] "rank": 2
[node30][DEBUG ]   }
[node30][DEBUG ] ]
[node30][DEBUG ]   },
[node30][DEBUG ]   "name": "node30",
[node30][DEBUG ]   "outside_quorum": [
[node30][DEBUG ] "node30"
[node30][DEBUG ]   ],
[node30][DEBUG ]   "quorum": [],
[node30][DEBUG ]   "rank": 0,
[node30][DEBUG ]   "state": "probing",
[node30][DEBUG ]   "sync_provider": []
[node30][DEBUG ] }
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mon ip addr is not followed ceph config file

2014-01-17 Thread Wido den Hollander

On 01/17/2014 12:46 PM, Tim Zhang wrote:

Hi guys,
I use ceph-deploy to deploy my ceph cluster.
This is my config file:
-
[global]
osd pool default size = 3
auth_service_required = none
filestore_xattr_use_omap = true
journal zero on create = true
auth_client_required = none
auth_cluster_required = none
mon_host = 192.168.1.172,192.168.1.130,192.168.1.115
osd_journal_size = 1024
public_network = 192.168.1.0/24 
mon_initial_members = node30, node31, node32
cluster_network = 192.168.1.0/24 
fsid = da79afb2-d85e-406a-b05b-80eaaac2e179
-

After deploying, the cluster is unhealthy and I find that the mon addresses do
not match the config file settings; the dedicated mon addresses (mon_host) end
up in the extra_probe_peers field according to the command:
ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.node30.asok
mon_status;
and it seems the cluster is using other IP addresses on the hosts. That's
strange; can anyone give some suggestions?



What version of Ceph is this? I've seen something similar with Dumpling 
where the mon would bind to 0.0.0.0:6800. I haven't been able to find 
the root cause yet.
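
As a workaround sketch only (not a fix), you could try pinning each monitor's 
address explicitly in ceph.conf and restarting the mons; the host/IP pairing 
below is a guess from your mon_host list, so adjust it to the real interfaces:

   [mon.node30]
       host = node30
       mon addr = 192.168.1.172:6789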


Wido


The output is as follows:
[node30][INFO  ] Running command: ceph --cluster=ceph --admin-daemon
/var/run/ceph/ceph-mon.node30.asok mon_status
[node30][DEBUG ]

[node30][DEBUG ] status for monitor: mon.node30
[node30][DEBUG ] {
[node30][DEBUG ]   "election_epoch": 0,
[node30][DEBUG ]   "extra_probe_peers": [
[node30][DEBUG ] "192.168.1.115:6789/0 ",
[node30][DEBUG ] "192.168.1.130:6789/0 ",
[node30][DEBUG ] "192.168.1.172:6789/0 "
[node30][DEBUG ]   ],
[node30][DEBUG ]   "monmap": {
[node30][DEBUG ] "created": "0.00",
[node30][DEBUG ] "epoch": 0,
[node30][DEBUG ] "fsid": "0d00a742-7ac1-4535-b0dc-26f5a0fe7924",
[node30][DEBUG ] "modified": "0.00",
[node30][DEBUG ] "mons": [
[node30][DEBUG ]   {
[node30][DEBUG ] "addr": "192.168.1.173:6789/0
",
[node30][DEBUG ] "name": "node30",
[node30][DEBUG ] "rank": 0
[node30][DEBUG ]   },
[node30][DEBUG ]   {
[node30][DEBUG ] "addr": "0.0.0.0:0/1 ",
[node30][DEBUG ] "name": "node31",
[node30][DEBUG ] "rank": 1
[node30][DEBUG ]   },
[node30][DEBUG ]   {
[node30][DEBUG ] "addr": "0.0.0.0:0/2 ",
[node30][DEBUG ] "name": "node32",
[node30][DEBUG ] "rank": 2
[node30][DEBUG ]   }
[node30][DEBUG ] ]
[node30][DEBUG ]   },
[node30][DEBUG ]   "name": "node30",
[node30][DEBUG ]   "outside_quorum": [
[node30][DEBUG ] "node30"
[node30][DEBUG ]   ],
[node30][DEBUG ]   "quorum": [],
[node30][DEBUG ]   "rank": 0,
[node30][DEBUG ]   "state": "probing",
[node30][DEBUG ]   "sync_provider": []
[node30][DEBUG ] }


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




--
Wido den Hollander
42on B.V.

Phone: +31 (0)20 700 9902
Skype: contact42on
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mon ip addr is not followed ceph config file

2014-01-17 Thread Joao Eduardo Luis

On 01/17/2014 12:02 PM, Wido den Hollander wrote:

On 01/17/2014 12:46 PM, Tim Zhang wrote:

Hi guys,
I use ceph-deploy to deploy my ceph cluster.
This is my config file:
-

[global]
osd pool default size = 3
auth_service_required = none
filestore_xattr_use_omap = true
journal zero on create = true
auth_client_required = none
auth_cluster_required = none
mon_host = 192.168.1.172,192.168.1.130,192.168.1.115
osd_journal_size = 1024
public_network = 192.168.1.0/24 
mon_initial_members = node30, node31, node32
cluster_network = 192.168.1.0/24 
fsid = da79afb2-d85e-406a-b05b-80eaaac2e179
-


after deploying , the cluster is unhealthy and I find mon addr's is not
as the config file setting, and the dedicated mon addrs(mon_host ) are
in the extra_probe_peers fields according to the command:
ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.node30.asok
mon_status;
and seems the cluster using the other ip addr on the hosts. Thats
strange, can anyone give some suggestions?



What version of Ceph is this? I've seen something similar with Dumpling
where the mon would bind to 0.0.0.0:6800. I haven't been able to find
the root cause yet.

Wido


http://tracker.ceph.com/issues/5804

Any info on how to reproduce this thing is welcome.

  -Joao




the output is as following:
[node30][INFO  ] Running command: ceph --cluster=ceph --admin-daemon
/var/run/ceph/ceph-mon.node30.asok mon_status
[node30][DEBUG ]


[node30][DEBUG ] status for monitor: mon.node30
[node30][DEBUG ] {
[node30][DEBUG ]   "election_epoch": 0,
[node30][DEBUG ]   "extra_probe_peers": [
[node30][DEBUG ] "192.168.1.115:6789/0
",
[node30][DEBUG ] "192.168.1.130:6789/0
",
[node30][DEBUG ] "192.168.1.172:6789/0 "
[node30][DEBUG ]   ],
[node30][DEBUG ]   "monmap": {
[node30][DEBUG ] "created": "0.00",
[node30][DEBUG ] "epoch": 0,
[node30][DEBUG ] "fsid": "0d00a742-7ac1-4535-b0dc-26f5a0fe7924",
[node30][DEBUG ] "modified": "0.00",
[node30][DEBUG ] "mons": [
[node30][DEBUG ]   {
[node30][DEBUG ] "addr": "192.168.1.173:6789/0
",
[node30][DEBUG ] "name": "node30",
[node30][DEBUG ] "rank": 0
[node30][DEBUG ]   },
[node30][DEBUG ]   {
[node30][DEBUG ] "addr": "0.0.0.0:0/1 ",
[node30][DEBUG ] "name": "node31",
[node30][DEBUG ] "rank": 1
[node30][DEBUG ]   },
[node30][DEBUG ]   {
[node30][DEBUG ] "addr": "0.0.0.0:0/2 ",
[node30][DEBUG ] "name": "node32",
[node30][DEBUG ] "rank": 2
[node30][DEBUG ]   }
[node30][DEBUG ] ]
[node30][DEBUG ]   },
[node30][DEBUG ]   "name": "node30",
[node30][DEBUG ]   "outside_quorum": [
[node30][DEBUG ] "node30"
[node30][DEBUG ]   ],
[node30][DEBUG ]   "quorum": [],
[node30][DEBUG ]   "rank": 0,
[node30][DEBUG ]   "state": "probing",
[node30][DEBUG ]   "sync_provider": []
[node30][DEBUG ] }


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com







--
Joao Eduardo Luis
Software Engineer | http://inktank.com | http://ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] cephfs: Minimal deployment

2014-01-17 Thread Iban Cabrillo
Dear all,
  we are studying the possibility of migrating our FS to CephFS next year. I know it
is not ready for production environments yet, but we are planning to play with it in
the coming months by deploying a basic testbed.
  Reading the documentation, I see 3 mons, 1 mds and several OSDs (all on
physical machines, as I have understood). Is this true?
  On the other hand I do not understand the fail-over mechanism for clients
that have mounted the FS. Looking at the documentation:

   ceph-fuse [ -m monaddr:port ] mountpoint [ fuse options ]

You have to specify (hardcode) the monaddr:port. If this mon (IP) is down, what
happens? Do you lose the FS on that node, or is there a generic round-robin DNS
mechanism for the mons?

  Is there any implementation of "tiering" or "HSM" at the software level? I
mean, can I mix different types of disk (SSD and SATA) in different pools,
and migrate data between them automatically (most used, size, last time
accessed)?



Please could anyone clarify these points for me?

Regards, I

-- 

Iban Cabrillo Bartolome
Instituto de Fisica de Cantabria (IFCA)
Santander, Spain
Tel: +34942200969

Bertrand Russell:
*"El problema con el mundo es que los estúpidos están seguros de todo y los
inteligentes están llenos de dudas*"
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mon ip addr is not followed ceph config file

2014-01-17 Thread Wido den Hollander

On 01/17/2014 01:29 PM, Joao Eduardo Luis wrote:

On 01/17/2014 12:02 PM, Wido den Hollander wrote:

On 01/17/2014 12:46 PM, Tim Zhang wrote:

Hi guys,
I use ceph-deploy to deploy my ceph cluster.
This is my config file:
-


[global]
osd pool default size = 3
auth_service_required = none
filestore_xattr_use_omap = true
journal zero on create = true
auth_client_required = none
auth_cluster_required = none
mon_host = 192.168.1.172,192.168.1.130,192.168.1.115
osd_journal_size = 1024
public_network = 192.168.1.0/24 
mon_initial_members = node30, node31, node32
cluster_network = 192.168.1.0/24 
fsid = da79afb2-d85e-406a-b05b-80eaaac2e179
-



after deploying , the cluster is unhealthy and I find mon addr's is not
as the config file setting, and the dedicated mon addrs(mon_host ) are
in the extra_probe_peers fields according to the command:
ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.node30.asok
mon_status;
and seems the cluster using the other ip addr on the hosts. Thats
strange, can anyone give some suggestions?



What version of Ceph is this? I've seen something similar with Dumpling
where the mon would bind to 0.0.0.0:6800. I haven't been able to find
the root cause yet.

Wido


http://tracker.ceph.com/issues/5804

Any info on how to reproduce this thing is welcome.



Done. I added a comment to the issue regarding how I saw it happen. I 
can reproduce it any moment using those VirtualBox images.


Wido


   -Joao




the output is as following:
[node30][INFO  ] Running command: ceph --cluster=ceph --admin-daemon
/var/run/ceph/ceph-mon.node30.asok mon_status
[node30][DEBUG ]



[node30][DEBUG ] status for monitor: mon.node30
[node30][DEBUG ] {
[node30][DEBUG ]   "election_epoch": 0,
[node30][DEBUG ]   "extra_probe_peers": [
[node30][DEBUG ] "192.168.1.115:6789/0
",
[node30][DEBUG ] "192.168.1.130:6789/0
",
[node30][DEBUG ] "192.168.1.172:6789/0
"
[node30][DEBUG ]   ],
[node30][DEBUG ]   "monmap": {
[node30][DEBUG ] "created": "0.00",
[node30][DEBUG ] "epoch": 0,
[node30][DEBUG ] "fsid": "0d00a742-7ac1-4535-b0dc-26f5a0fe7924",
[node30][DEBUG ] "modified": "0.00",
[node30][DEBUG ] "mons": [
[node30][DEBUG ]   {
[node30][DEBUG ] "addr": "192.168.1.173:6789/0
",
[node30][DEBUG ] "name": "node30",
[node30][DEBUG ] "rank": 0
[node30][DEBUG ]   },
[node30][DEBUG ]   {
[node30][DEBUG ] "addr": "0.0.0.0:0/1 ",
[node30][DEBUG ] "name": "node31",
[node30][DEBUG ] "rank": 1
[node30][DEBUG ]   },
[node30][DEBUG ]   {
[node30][DEBUG ] "addr": "0.0.0.0:0/2 ",
[node30][DEBUG ] "name": "node32",
[node30][DEBUG ] "rank": 2
[node30][DEBUG ]   }
[node30][DEBUG ] ]
[node30][DEBUG ]   },
[node30][DEBUG ]   "name": "node30",
[node30][DEBUG ]   "outside_quorum": [
[node30][DEBUG ] "node30"
[node30][DEBUG ]   ],
[node30][DEBUG ]   "quorum": [],
[node30][DEBUG ]   "rank": 0,
[node30][DEBUG ]   "state": "probing",
[node30][DEBUG ]   "sync_provider": []
[node30][DEBUG ] }


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com










--
Wido den Hollander
42on B.V.

Phone: +31 (0)20 700 9902
Skype: contact42on
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mon ip addr is not followed ceph config file

2014-01-17 Thread Alfredo Deza
On Fri, Jan 17, 2014 at 8:41 AM, Wido den Hollander  wrote:
> On 01/17/2014 01:29 PM, Joao Eduardo Luis wrote:
>>
>> On 01/17/2014 12:02 PM, Wido den Hollander wrote:
>>>
>>> On 01/17/2014 12:46 PM, Tim Zhang wrote:

 Hi guys,
 I use ceph-deploy to deploy my ceph cluster.
 This is my config file:

 -


 [global]
 osd pool default size = 3
 auth_service_required = none
 filestore_xattr_use_omap = true
 journal zero on create = true
 auth_client_required = none
 auth_cluster_required = none
 mon_host = 192.168.1.172,192.168.1.130,192.168.1.115
 osd_journal_size = 1024
 public_network = 192.168.1.0/24 
 mon_initial_members = node30, node31, node32
 cluster_network = 192.168.1.0/24 
 fsid = da79afb2-d85e-406a-b05b-80eaaac2e179

 -



 after deploying , the cluster is unhealthy and I find mon addr's is not
 as the config file setting, and the dedicated mon addrs(mon_host ) are
 in the extra_probe_peers fields according to the command:
 ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.node30.asok
 mon_status;
 and seems the cluster using the other ip addr on the hosts. Thats
 strange, can anyone give some suggestions?

>>>
>>> What version of Ceph is this? I've seen something similar with Dumpling
>>> where the mon would bind to 0.0.0.0:6800. I haven't been able to find
>>> the root cause yet.
>>>
>>> Wido
>>
>>
>> http://tracker.ceph.com/issues/5804
>>
>> Any info on how to reproduce this thing is welcome.
>>
>
> Done. I added a comment to the issue regarding how I saw it happen. I can
> reproduce it any moment using those VirtualBox images.
>
> Wido
>
>
>>-Joao
>>
>>>
 the output is as following:

This is not the full output that ceph-deploy would have produced.

One of the useful things ceph-deploy does is try to be very granular
about what is going on, so that debugging is easier.

Would you mind pasting the complete log output and maybe adding it to the
ticket as well?


 [node30][INFO  ] Running command: ceph --cluster=ceph --admin-daemon
 /var/run/ceph/ceph-mon.node30.asok mon_status
 [node30][DEBUG ]

 


 [node30][DEBUG ] status for monitor: mon.node30
 [node30][DEBUG ] {
 [node30][DEBUG ]   "election_epoch": 0,
 [node30][DEBUG ]   "extra_probe_peers": [
 [node30][DEBUG ] "192.168.1.115:6789/0
 ",
 [node30][DEBUG ] "192.168.1.130:6789/0
 ",
 [node30][DEBUG ] "192.168.1.172:6789/0
 "
 [node30][DEBUG ]   ],
 [node30][DEBUG ]   "monmap": {
 [node30][DEBUG ] "created": "0.00",
 [node30][DEBUG ] "epoch": 0,
 [node30][DEBUG ] "fsid": "0d00a742-7ac1-4535-b0dc-26f5a0fe7924",
 [node30][DEBUG ] "modified": "0.00",
 [node30][DEBUG ] "mons": [
 [node30][DEBUG ]   {
 [node30][DEBUG ] "addr": "192.168.1.173:6789/0
 ",
 [node30][DEBUG ] "name": "node30",
 [node30][DEBUG ] "rank": 0
 [node30][DEBUG ]   },
 [node30][DEBUG ]   {
 [node30][DEBUG ] "addr": "0.0.0.0:0/1 ",
 [node30][DEBUG ] "name": "node31",
 [node30][DEBUG ] "rank": 1
 [node30][DEBUG ]   },
 [node30][DEBUG ]   {
 [node30][DEBUG ] "addr": "0.0.0.0:0/2 ",
 [node30][DEBUG ] "name": "node32",
 [node30][DEBUG ] "rank": 2
 [node30][DEBUG ]   }
 [node30][DEBUG ] ]
 [node30][DEBUG ]   },
 [node30][DEBUG ]   "name": "node30",
 [node30][DEBUG ]   "outside_quorum": [
 [node30][DEBUG ] "node30"
 [node30][DEBUG ]   ],
 [node30][DEBUG ]   "quorum": [],
 [node30][DEBUG ]   "rank": 0,
 [node30][DEBUG ]   "state": "probing",
 [node30][DEBUG ]   "sync_provider": []
 [node30][DEBUG ] }


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

>>>
>>>
>>
>>
>
>
> --
> Wido den Hollander
> 42on B.V.
>
> Phone: +31 (0)20 700 9902
> Skype: contact42on
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] v0.75 released

2014-01-17 Thread Peter Matulis
On 01/14/2014 10:42 PM, Sage Weil wrote:
> This is a big release, with lots of infrastructure going in for
> firefly.  The big items include a prototype standalone frontend for
> radosgw (which does not require apache or fastcgi), tracking for read
> activity on the osds (to inform tiering decisions), preliminary cache
> pool support (no snapshots yet), and lots of bug fixes and other work
> across the tree to get ready for the next batch of erasure coding
> patches.

...

> * The default CRUSH rules and layouts are now using the latest and
>   greatest tunables and defaults.  Clusters using the old values will
>   now present with a health WARN state.  This can be disabled by
>   adding 'mon warn on legacy crush tunables = false' to ceph.conf.

So for upgraded clusters how does one partake in those latest and
greatest tunables & defaults?
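
My guess from the docs is something like the command below, though I'd like 
confirmation (and I understand it can trigger data movement):

   ceph osd crush tunables optimal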

/pm
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs: Minimal deployment

2014-01-17 Thread Gregory Farnum
On Friday, January 17, 2014, Iban Cabrillo  wrote:

> Dear all,
>   we are studying the possibility of migrating our FS to CephFS next year. I
> know it is not ready for production environments yet, but we are planning to
> play with it in the coming months by deploying a basic testbed.
>   Reading the documentation, I see 3 mons, 1 mds and several OSDs (all on
> physical machines, as I have understood). Is this true?
>   On the other hand I do not understand the fail-over mechanism for
> clients that have mounted the FS. Looking at the documentation:
>
>    ceph-fuse [ -m monaddr:port ] mountpoint [ fuse options ]
>
> You have to specify (hardcode) the monaddr:port. If this mon (IP) is
> down, what happens? Do you lose the FS on that node, or is there a generic
> round-robin DNS mechanism for the mons?
>

You can actually specify a list of mons, and once connected to any mon the
client fetches the full list and will reconnect to them if the one it is
currently talking to goes down.
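
For example (the addresses are placeholders):

   ceph-fuse -m 192.168.1.10:6789,192.168.1.11:6789,192.168.1.12:6789 /mnt/cephfs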



>
> Is there any implementation of "tiering" or "HSM" at the software level? I
> mean, can I mix different types of disk (SSD and SATA) in different pools,
> and migrate data between them automatically (most used, size, last time
> accessed)?
>

Sadly no. We're starting to play in this space in our upcoming Firefly
release with "cache pools", but it's not quite done yet.
-Greg


>
>
>
> *Please could anyone clarify to me this point?*
>
> *Regards, I*
>
> --
> 
> Iban Cabrillo Bartolome
> Instituto de Fisica de Cantabria (IFCA)
> Santander, Spain
> Tel: +34942200969
> 
> Bertrand Russell:
> *"El problema con el mundo es que los estúpidos están seguros de todo y
> los inteligentes están llenos de dudas*"
>


-- 
Software Engineer #42 @ http://inktank.com | http://ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph cluster is unreachable because of authentication failure

2014-01-17 Thread Sage Weil
On Fri, 17 Jan 2014, Guang wrote:
> Thanks Sage.
> 
> I have further narrowed the problem down to: any command using the paxos service
> hangs. The details follow:
> 
> 1. I am able to run ceph status / osd dump, etc.; however, the results are out 
> of date (though I stopped all OSDs, that is not reflected in the ceph status 
> report).
> 
> -bash-4.1$ sudo ceph -s
>   cluster b9cb3ea9-e1de-48b4-9e86-6921e2c537d2
>health HEALTH_WARN 2797 pgs degraded; 107 pgs down; 7503 pgs peering; 917 
> pgs recovering; 6079 pgs recovery_wait; 2957 pgs stale; 7771 pgs stuck 
> inactive; 2957 pgs stuck stale; 16567 pgs stuck unclean; recovery 
> 54346804/779462977 degraded (6.972%); 9/259724199 unfound (0.000%); 2 near 
> full osd(s); 57/751 in osds are down; 
> noout,nobackfill,norecover,noscrub,nodeep-scrub flag(s) set
>monmap e1: 3 mons at 
> {osd151=10.194.0.68:6789/0,osd152=10.193.207.130:6789/0,osd153=10.193.207.131:6789/0},
>  election epoch 123278, quorum 0,1,2 osd151,osd152,osd153
>osdmap e134893: 781 osds: 694 up, 751 in
> pgmap v2388518: 22203 pgs: 26 inactive, 14 active, 79 
> stale+active+recovering, 5020 active+clean, 242 stale, 4352 
> active+recovery_wait, 616 stale+active+clean, 177 active+recovering+degraded, 
> 6714 peering, 925 stale+active+recovery_wait, 86 down+peering, 1547 
> active+degraded, 32 stale+active+recovering+degraded, 648 stale+peering, 21 
> stale+down+peering, 239 stale+active+degraded, 651 
> active+recovery_wait+degraded, 30 remapped+peering, 151 
> stale+active+recovery_wait+degraded, 4 stale+remapped+peering, 629 
> active+recovering; 79656 GB data, 363 TB used, 697 TB / 1061 TB avail; 
> 54346804/779462977 degraded (6.972%); 9/259724199 unfound (0.000%)
>mdsmap e1: 0/0/1 up
> 
> 2. If I run a command which uses paxos, the command hangs forever. This 
> includes 'ceph osd set noup' (and also the commands an OSD sends to the 
> monitor when being started (create-or-add)).
> 
> I attached the corresponding monitor log (it is like a bug).

I see the osd set command coming through, but it arrives while paxos is 
converging and the log seems to end before the mon would normally process 
the delayed messages.  Is there a reason why the log fragment you attached 
ends there, or did the process hang or something?
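
If it did hang, a fuller trace would help; something like this on the affected 
mon (via the admin socket, so it works even without quorum) should do it:

   ceph --admin-daemon /var/run/ceph/ceph-mon.osd151.asok config set debug_mon 20
   ceph --admin-daemon /var/run/ceph/ceph-mon.osd151.asok config set debug_paxos 20
   ceph --admin-daemon /var/run/ceph/ceph-mon.osd151.asok config set debug_ms 1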

Thanks-
sage

> I 
> 
> On Jan 17, 2014, at 1:35 AM, Sage Weil  wrote:
> 
> > Hi Guang,
> > 
> > On Thu, 16 Jan 2014, Guang wrote:
> >> I still have bad the luck to figure out what is the problem making 
> >> authentication failure, so in order to get the cluster back, I tried:
> >>  1. stop all daemons (mon & osd)
> >>  2. change the configuration to disable cephx
> >>  3. start mon daemons (3 in total)
> >>  4. start osd daemon one by one
> >> 
> >> After finishing step 3, the cluster can be reachable ('ceph -s' give 
> >> results):
> >> -bash-4.1$ sudo ceph -s
> >>  cluster b9cb3ea9-e1de-48b4-9e86-6921e2c537d2
> >>   health HEALTH_WARN 2797 pgs degraded; 107 pgs down; 7503 pgs peering; 
> >> 917 pgs recovering; 6079 pgs recovery_wait; 2957 pgs stale; 7771 pgs stuck 
> >> inactive; 2957 pgs stuck stale; 16567 pgs stuck unclean; recovery 
> >> 54346804/779462977 degraded (6.972%); 9/259724199 unfound (0.000%); 2 near 
> >> full osd(s); 57/751 in osds are down; 
> >> noout,nobackfill,norecover,noscrub,nodeep-scrub flag(s) set
> >>   monmap e1: 3 mons at 
> >> {osd151=10.194.0.68:6789/0,osd152=10.193.207.130:6789/0,osd153=10.193.207.131:6789/0},
> >>  election epoch 106022, quorum 0,1,2 osd151,osd152,osd153
> >>   osdmap e134893: 781 osds: 694 up, 751 in
> >>pgmap v2388518: 22203 pgs: 26 inactive, 14 active, 79 
> >> stale+active+recovering, 5020 active+clean, 242 stale, 4352 
> >> active+recovery_wait, 616 stale+active+clean, 177 
> >> active+recovering+degraded, 6714 peering, 925 stale+active+recovery_wait, 
> >> 86 down+peering, 1547 active+degraded, 32 
> >> stale+active+recovering+degraded, 648 stale+peering, 21 
> >> stale+down+peering, 239 stale+active+degraded, 651 
> >> active+recovery_wait+degraded, 30 remapped+peering, 151 
> >> stale+active+recovery_wait+degraded, 4 stale+remapped+peering, 629 
> >> active+recovering; 79656 GB data, 363 TB used, 697 TB / 1061 TB avail; 
> >> 54346804/779462977 degraded (6.972%); 9/259724199 unfound (0.000%)
> >>   mdsmap e1: 0/0/1 up
> >> (at this point, all OSDs should be down).
> >> 
> >> When I tried to start OSD daemon, the starting script got hang, and the 
> >> process hang is:
> >> root  80497  80496  0 08:18 pts/000:00:00 python /usr/bin/ceph 
> >> --name=osd.22 --keyring=/var/lib/ceph/osd/ceph-22/keyring osd crush 
> >> create-or-move -- 22 0.40 root=default host=osd173
> >> 
> >> When I strace the starting script, I got the following traces (process 
> >> 75873 is the above process), it failed with futex and then do a infinite 
> >> loop:
> >>   select(0, NULL, NULL, NULL, {0, 16000}) = 0 (Timeout)
> >> Any idea what might trigger this?
> > 
> > It is hard to tell from the strace what is going on

Re: [ceph-users] Ceph / Dell hardware recommendation

2014-01-17 Thread Shain Miley
Just an FYI...we have a Ceph cluster setup for archiving audio and video using 
the following Dell hardware:

6 x Dell R-720xd; 64 GB of RAM; for OSD nodes
72 x 4TB SAS drives as OSDs
3 x Dell R-420; 32 GB of RAM; for MON/RADOSGW/MDS nodes
2 x Force10 S4810 switches
4 x 10 GigE LACP-bonded Intel cards

This provides us with about 260 TB of usable space. With rados bench we are 
able to get the following on some of the pools we tested:

1 replica - 1175 MB/s
2 replicas - 850 MB/s
3 replicas - 625 MB/s

If we decide to build a second cluster in the future for rbd backed vm's, we 
will either be looking into the new ceph 'ssd tiering' options, or a little bit 
less dense Dell nodes for osd's using ssd's for the journals, in order to 
maximize performance.

Shain


Shain Miley | Manager of Systems and Infrastructure, Digital Media | 
smi...@npr.org | 202.513.3649


From: ceph-users-boun...@lists.ceph.com [ceph-users-boun...@lists.ceph.com] on 
behalf of Lincoln Bryant [linco...@uchicago.edu]
Sent: Thursday, January 16, 2014 1:10 PM
To: Cedric Lemarchand
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Ceph / Dell hardware recommendation

For our ~400 TB Ceph deployment, we bought:
(2) R720s w/ dual X5660s and 96 GB of RAM
(1) 10Gb NIC (2 interfaces per card)
(4) MD1200s per machine
...and a boat load of 4TB disks!

In retrospect, I would almost certainly would have gotten more servers. During 
heavy writes we see the load spiking up to ~50 on Emperor and warnings about 
slow OSDs, but we clearly seem to be on the extreme with something like 60 OSDs 
per box :)

Cheers,
Lincoln

On Jan 16, 2014, at 4:09 AM, Cedric Lemarchand wrote:

>
> Le 16/01/2014 10:16, NEVEU Stephane a écrit :
>> Thank you all for comments,
>>
>> So to sum up a bit, it's a reasonable compromise to buy :
>> 2 x R720 with 2x Intel E5-2660v2, 2.2GHz, 25M Cache, 48Gb RAM, 2 x 146GB, 
>> SAS 6Gbps, 2.5-in, 15K RPM Hard Drive (Hot-plug) Flex Bay for OS and 24 x 
>> 1.2TB, SAS 6Gbps, 2.5in, 10K RPM Hard Drive for OSDs (journal located on 
>> each osd) and PERC H710p Integrated RAID Controller, 1GB NV Cache
>> ?
>> Or is it a better idea to buy 4 servers less powerful instead of 2 ?
> I think you are facing the well known trade off between 
> price/performances/usable storage size.
>
> More servers less powerfull will give you better power computation and better 
> iops by usable To, but will be more expensive. An extrapolation of that that 
> would be to use a blade for each To => very powerful/very expensive.
>
>
> The choice really depend of the work load you need to handle, witch is not an 
> easy thing to estimate.
>
> Cheers
>
> --
> Cédric
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph / Dell hardware recommendation

2014-01-17 Thread Ahmed Kamal
Thanks for the numbers, Shain. I'm new to Ceph and I definitely like the
technology. However, I'm not sure how to tell whether the transfer numbers
you mentioned would be considered "good". For example, assuming a single
disk's rate is barely 50 MB/s, the 1175 MB/s is merely the aggregate
bandwidth of 24 disks. Since Ceph writes twice for journaling, I'm willing
to accept we're effectively utilizing 48 drives. That is still only two
thirds of the available 72-disk bandwidth. I'd like to better understand
why we're seeing such numbers, and whether they are typical/good. Thanks!
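
(My back-of-envelope, for what it's worth, assuming ~50 MB/s per spindle:

   72 disks x 50 MB/s            ~ 3600 MB/s raw
   / 2 for journal double-write  ~ 1800 MB/s effective ceiling
   observed with 1 replica       = 1175 MB/s, i.e. roughly two thirds of that)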


On Fri, Jan 17, 2014 at 7:03 PM, Shain Miley  wrote:

> Just an FYI...we have a Ceph cluster setup for archiving audio and video
> using the following Dell hardware:
>
> 6 x Dell R-720xd; 64 GB of RAM; for OSD nodes
> 72 x 4TB SAS drives as OSDs
> 3 x Dell R-420; 32 GB of RAM; for MON/RADOSGW/MDS nodes
> 2 x Force10 S4810 switches
> 4 x 10 GigE LACP-bonded Intel cards
>
> This provides us with about 260 TB of usable space. With rados bench we
> are able to get the following on some of the pools we tested:
>
> 1 replica - 1175 MB/s
> 2 replicas - 850 MB/s
> 3 replicas - 625 MB/s
>
> If we decide to build a second cluster in the future for rbd backed vm's,
> we will either be looking into the new ceph 'ssd tiering' options, or a
> little bit less dense Dell nodes for osd's using ssd's for the journals, in
> order to maximize performance.
>
> Shain
>
>
> Shain Miley | Manager of Systems and Infrastructure, Digital Media |
> smi...@npr.org | 202.513.3649
>
> 
> From: ceph-users-boun...@lists.ceph.com [ceph-users-boun...@lists.ceph.com]
> on behalf of Lincoln Bryant [linco...@uchicago.edu]
> Sent: Thursday, January 16, 2014 1:10 PM
> To: Cedric Lemarchand
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Ceph / Dell hardware recommendation
>
> For our ~400 TB Ceph deployment, we bought:
> (2) R720s w/ dual X5660s and 96 GB of RAM
> (1) 10Gb NIC (2 interfaces per card)
> (4) MD1200s per machine
> ...and a boat load of 4TB disks!
>
> In retrospect, I would almost certainly would have gotten more servers.
> During heavy writes we see the load spiking up to ~50 on Emperor and
> warnings about slow OSDs, but we clearly seem to be on the extreme with
> something like 60 OSDs per box :)
>
> Cheers,
> Lincoln
>
> On Jan 16, 2014, at 4:09 AM, Cedric Lemarchand wrote:
>
> >
> > Le 16/01/2014 10:16, NEVEU Stephane a écrit :
> >> Thank you all for comments,
> >>
> >> So to sum up a bit, it's a reasonable compromise to buy :
> >> 2 x R720 with 2x Intel E5-2660v2, 2.2GHz, 25M Cache, 48Gb RAM, 2 x
> 146GB, SAS 6Gbps, 2.5-in, 15K RPM Hard Drive (Hot-plug) Flex Bay for OS and
> 24 x 1.2TB, SAS 6Gbps, 2.5in, 10K RPM Hard Drive for OSDs (journal located
> on each osd) and PERC H710p Integrated RAID Controller, 1GB NV Cache
> >> ?
> >> Or is it a better idea to buy 4 servers less powerful instead of 2 ?
> > I think you are facing the well known trade off between
> price/performances/usable storage size.
> >
> > More servers less powerfull will give you better power computation and
> better iops by usable To, but will be more expensive. An extrapolation of
> that that would be to use a blade for each To => very powerful/very
> expensive.
> >
> >
> > The choice really depend of the work load you need to handle, witch is
> not an easy thing to estimate.
> >
> > Cheers
> >
> > --
> > Cédric
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph / Dell hardware recommendation

2014-01-17 Thread Thomas Johnson
I guess I joined the mailing list at just the right time, since I'm just 
starting to size out a ceph cluster, and I was just starting to read about how 
best to size out the nodes.

You mention considering less dense nodes for OSDs.

Assuming you used nodes with similar CPU, RAM, etc., at what point do you think 
you hit the 'sweet spot'? Would you do 6 drives per OSD node? 4?

My first thought was some 4-bay servers (similar to what you described from a 
CPU/RAM standpoint), putting 3 x 4TB SATA drives in each, plus one SSD for the 
journal.  But then I was wondering if a higher drive count per chassis might be 
a better choice, hence my question above.

And then just to make it interesting...

Another thing I'm considering is that I have 20 or so servers that do various 
tasks, but which aren't heavily loaded. They are small 1U units, though, but I 
have one open SATA bay in each - I could just drop a drive into each one and make 
each an OSD node, and really spread things out.  But is that better than 
building a Ceph-specific cluster?  I don't have the faintest idea yet... 
has anybody out there compared these options? Any thoughts?

Tom


On Jan 17, 2014, at 9:03 AM, Shain Miley  wrote:

> Just an FYI...we have a Ceph cluster setup for archiving audio and video 
> using the following Dell hardware:
> 
> 6 x Dell R-720xd; 64 GB of RAM; for OSD nodes
> 72 x 4TB SAS drives as OSDs
> 3 x Dell R-420; 32 GB of RAM; for MON/RADOSGW/MDS nodes
> 2 x Force10 S4810 switches
> 4 x 10 GigE LACP-bonded Intel cards
> 
> This provides us with about 260 TB of usable space. With rados bench we are 
> able to get the following on some of the pools we tested:
> 
> 1 replica - 1175 MB/s
> 2 replicas - 850 MB/s
> 3 replicas - 625 MB/s
> 
> If we decide to build a second cluster in the future for rbd backed vm's, we 
> will either be looking into the new ceph 'ssd tiering' options, or a little 
> bit less dense Dell nodes for osd's using ssd's for the journals, in order to 
> maximize performance.
> 
> Shain
> 
> 
> Shain Miley | Manager of Systems and Infrastructure, Digital Media | 
> smi...@npr.org | 202.513.3649
> 
> 
> From: ceph-users-boun...@lists.ceph.com [ceph-users-boun...@lists.ceph.com] 
> on behalf of Lincoln Bryant [linco...@uchicago.edu]
> Sent: Thursday, January 16, 2014 1:10 PM
> To: Cedric Lemarchand
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Ceph / Dell hardware recommendation
> 
> For our ~400 TB Ceph deployment, we bought:
>(2) R720s w/ dual X5660s and 96 GB of RAM
>(1) 10Gb NIC (2 interfaces per card)
>(4) MD1200s per machine
>...and a boat load of 4TB disks!
> 
> In retrospect, I would almost certainly would have gotten more servers. 
> During heavy writes we see the load spiking up to ~50 on Emperor and warnings 
> about slow OSDs, but we clearly seem to be on the extreme with something like 
> 60 OSDs per box :)
> 
> Cheers,
> Lincoln
> 
> On Jan 16, 2014, at 4:09 AM, Cedric Lemarchand wrote:
> 
>> 
>> Le 16/01/2014 10:16, NEVEU Stephane a écrit :
>>> Thank you all for comments,
>>> 
>>> So to sum up a bit, it's a reasonable compromise to buy :
>>> 2 x R720 with 2x Intel E5-2660v2, 2.2GHz, 25M Cache, 48Gb RAM, 2 x 146GB, 
>>> SAS 6Gbps, 2.5-in, 15K RPM Hard Drive (Hot-plug) Flex Bay for OS and 24 x 
>>> 1.2TB, SAS 6Gbps, 2.5in, 10K RPM Hard Drive for OSDs (journal located on 
>>> each osd) and PERC H710p Integrated RAID Controller, 1GB NV Cache
>>> ?
>>> Or is it a better idea to buy 4 servers less powerful instead of 2 ?
>> I think you are facing the well known trade off between 
>> price/performances/usable storage size.
>> 
>> More servers less powerfull will give you better power computation and 
>> better iops by usable To, but will be more expensive. An extrapolation of 
>> that that would be to use a blade for each To => very powerful/very 
>> expensive.
>> 
>> 
>> The choice really depend of the work load you need to handle, witch is not 
>> an easy thing to estimate.
>> 
>> Cheers
>> 
>> --
>> Cédric
>> 
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs: Minimal deployment

2014-01-17 Thread Iban Cabrillo
Hi Greg,


2014/1/17 Gregory Farnum 

> On Friday, January 17, 2014, Iban Cabrillo 
> wrote:
>
>> Dear,
>>   we are studying the possibility to migrate our FS in the next year to
>> cephfs. I know that it is not prepare for production environments yet, but
>> we are planning to play with it in the next months deploying a basic
>> testbed.
>>   Reading the documentation, I see 3 mons, 1 mds and several ods's (both
>> in physical machines..I have understood). Is this true?
>>   On the other hand I do not understand the fail-over mechanism for
>> clients when have mounted a FS, looking at documentation :
>>
>>*ceph-fuse* [ -m *monaddr*:*port* ] *mountpoint* [ *fuse options* ]
>> You have to specify (hardcode) the "*monaddr*:*port", if this mon (ip)
>> is down, what happen, Do you lost the fs on that node?, or there is a
>> generic dns-rrd implementation for mons??*
>>
>
> You can actually specify a list of mons, and once connected to any mon the
> client fetches the full list and will reconnect to them if the one it is
> currently talking to goes down.
>
OK, thanks for the info.

>
>
>>
>> *  Is there any implementation for "tiering" or "HSM" at software level,
>> I mean, can I mix different type of disk (ssds and SATA) on diferent pools,
>> and migrate data between them automatically (most used, size, last time
>> access) *
>>
>
> Sadly no. We're starting to play in this space in our upcoming Firefly
> release with "cache pools", but it's not quite done yet.
> -Greg
>
It is good to hear that these features are on the developers' minds. I will try a
minimal installation to understand how Ceph works.

>
>
>>
>>
>>
>> *Please could anyone clarify to me this point?*
>>
>> *Regards, I*
>>
>> --
>> 
>> Iban Cabrillo Bartolome
>> Instituto de Fisica de Cantabria (IFCA)
>> Santander, Spain
>> Tel: +34942200969
>> 
>> Bertrand Russell:
>> *"El problema con el mundo es que los estúpidos están seguros de todo y
>> los inteligentes están llenos de dudas*"
>>
>
>
> --
> Software Engineer #42 @ http://inktank.com | http://ceph.com
>

Cheers, I
Bertrand Russell:
*"El problema con el mundo es que los estúpidos están seguros de todo y los
inteligentes están llenos de dudas*"
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] radosgw s3 api for java.

2014-01-17 Thread raj kumar
I tried using the aws-java-sdk. I can list the buckets, but I can't do any other
operations such as creating/deleting objects/buckets; I get 403/405 response
codes. Please let me know if anybody has used it. Subdomains are resolving
properly in DNS. Thanks.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] v0.75 released

2014-01-17 Thread Alexandre Oliva
On Jan 15, 2014, Sage Weil  wrote:

>  v0.75 291 files changed, 82713 insertions(+), 33495 deletions(-)

> Upgrading
> ~

I suggest adding:

  * All (replicated?) pools will likely fail scrubbing because the
    per-pool dirty object counts, introduced in 0.75, won't match.  This
    inconsistency is cleared by a pg repair (see the sketch below);
    unfortunately this is about as expensive as a deep-scrub, and it's not
    automatically scheduled or retried, like scrubs and deep-scrubs.
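
A minimal sketch of the repair loop, for anyone hitting the same (the pg id is 
just an example):

   # list the inconsistent pgs
   ceph health detail | grep inconsistent
   # then repair them one by one
   ceph pg repair 2.1f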

I suppose after the dirty counts are brought to sync, the next scrub
won't find inconsistent counts again, but I haven't got to that point
yet.

What surprised me was the huge number of objects marked as dirty!  It
was at least 14k out of 70k objects in each data pool, and even more in
metadata pools, but it's not like I have messed with this many objects
recently.  Could something be amiss there?

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist Red Hat Brazil Toolchain Engineer
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] v0.75 released

2014-01-17 Thread Mark Kirkwood

On 18/01/14 19:50, Alexandre Oliva wrote:

On Jan 15, 2014, Sage Weil  wrote:


  v0.75 291 files changed, 82713 insertions(+), 33495 deletions(-)



Upgrading
~


I suggest adding:

   * All (replicated?) pools will likely fail scrubbing because the
 per-pool dirty object counts, introduced in 0.75, won't match.  This
 inconsistency is cleared by a pg repair; unfortunately this is about
 as expensive as a deep-scrub, and it's not automatically scheduled
 or retried, like scrubs and deep-scrubs.

I suppose after the dirty counts are brought to sync, the next scrub
won't find inconsistent counts again, but I haven't got to that point
yet.

What surprised me was the huge number of objects marked as dirty!  It
was at least 14k out of 70k objects in each data pool, and even more in
metadata pools, but it's not like I have messed with this many objects
recently.  Could something be amiss there?



And stat mismatches too, I think, are going to require folks to run repairs.

Regards

Mark
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com