[ceph-users] heterogeneous set of storage disks as a single storage

2014-09-08 Thread pragya jain
hi all! I have a very low level query. Please help to clarify it. To store data on a storage cluster, at the bottom, there is a heterogeneous set of storage disks, in which there can be a variety of storage disks, such as SSDs, HDDs, flash drives, tapes and any other type also. Document says tha
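At the time of this thread the usual way to keep different media types apart was to give each type its own CRUSH root and rule and then point pools at the matching rule; device classes did not exist yet. A minimal sketch, with purely hypothetical bucket, host, osd and pool names:

ceph osd crush add-bucket ssd root
ceph osd crush add-bucket node1-ssd host
ceph osd crush move node1-ssd root=ssd
ceph osd crush set osd.10 1.0 root=ssd host=node1-ssd
ceph osd crush rule create-simple ssd-rule ssd host
ceph osd pool set fastpool crush_ruleset 1   # rule id 1 is illustrative; check 'ceph osd crush rule dump'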

Re: [ceph-users] Is ceph osd reweight always safe to use?

2014-09-08 Thread Christian Balzer
Hello, On Tue, 09 Sep 2014 01:25:17 -0400 JR wrote: > Greetings > > After running for a couple of hours, my attempt to re-balance a near full > disk has stopped with a stuck unclean error: > Which is exactly what I warned you about below and what you should have also taken away from fully readi

Re: [ceph-users] Is ceph osd reweight always safe to use?

2014-09-08 Thread JR
Greetings After running for a couple of hours, my attempt to re-balance a near full disk has stopped with a stuck unclean error: root@osd45:~# ceph -s cluster c8122868-27af-11e4-b570-52540004010f health HEALTH_WARN 6 pgs backfilling; 6 pgs stuck unclean; recovery 13086/1158268 degraded (1.130

Re: [ceph-users] Re: mix ceph version with 0.80.5 and 0.85

2014-09-08 Thread 廖建锋
there is nothing about this in ceph.com From: Jason King Sent: 2014-09-09 11:19 To: 廖建锋 Cc: ceph-users; ceph-users Subject: Re: [ceph-users] Re: mix ceph version with 0.80.5 an

Re: [ceph-users] ceph cluster inconsistency keyvaluestore

2014-09-08 Thread Sage Weil
On Sun, 7 Sep 2014, Haomai Wang wrote: > I have found the root cause. It's a bug. > > When chunky scrub happens, it will iterate over the whole pg's objects and > in each iteration only a few objects will be scanned. > > osd/PG.cc:3758 > ret = get_pgbackend()->objects_list_partial( > start, >

Re: [ceph-users] all my osds are down, but ceph -s tells they are up and in.

2014-09-08 Thread Sage Weil
On Tue, 9 Sep 2014, yuelongguang wrote: > hi,all >   > that is crazy. > 1. > all my osds are down, but ceph -s tells they are up and in. why? Peer OSDs normally handle failure detection. If all OSDs are down, there is nobody to report the failures. After 5 or 10 minutes if the OSDs don't report
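A quick way to see the gap Sage describes between the monitors' view and reality (osd id and paths are illustrative):

ceph osd tree              # the cluster map's idea of which OSDs are up
ps aux | grep ceph-osd     # run on an OSD host: are the daemons actually alive?
# With every OSD dead there are no peers left to report failures, so the map
# only catches up once the monitors' reporting timeout expires.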

Re: [ceph-users] Re: mix ceph version with 0.80.5 and 0.85

2014-09-08 Thread Jason King
Check the docs. 2014-09-09 11:02 GMT+08:00 廖建锋 : > Looks like it doesn't work. I noticed that 0.85 added a superblock to > the leveldb osd, and the osds which I already have do not have a superblock. > Is there anybody who can tell me how to upgrade the OSDs? > > > > *From:* ceph-users > *Sent:* 2014-09-09 10:32 >

[ceph-users] Re: mix ceph version with 0.80.5 and 0.85

2014-09-08 Thread 廖建锋
Looks like it doesn't work. I noticed that 0.85 added a superblock to the leveldb osd, and the osds which I already have do not have a superblock. Is there anybody who can tell me how to upgrade the OSDs? From: ceph-users Sent: 2014-09-09 10:32 To: ceph-users

Re: [ceph-users] Is ceph osd reweight always safe to use?

2014-09-08 Thread JR
Hi Christian, Ha ... root@osd45:~# ceph osd pool get rbd pg_num pg_num: 128 root@osd45:~# ceph osd pool get rbd pgp_num pgp_num: 64 That's the explanation! I did run the command, but it spat out a warning (which I thought was harmless); I should have checked more carefully. I now have the exp
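Restating the check and the fix from this thread as a short sketch (pool name and target value taken from the messages above):

ceph osd pool get rbd pg_num    # 128
ceph osd pool get rbd pgp_num   # 64 -- placement won't follow the new PGs until this matches
ceph osd pool set rbd pgp_num 128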

Re: [ceph-users] SSD journal deployment experiences

2014-09-08 Thread Christian Balzer
On Tue, 9 Sep 2014 01:40:42 + Quenten Grasso wrote: > This reminds me of something I was trying to find out awhile back. > > If we have 2000 "Random" IOPS of which are 4K Blocks our cluster > (assuming 3 x Replicas) will generate 6000 IOPS @ 4K onto the journals. > > Does this mean our Journ

[ceph-users] mix ceph version with 0.80.5 and 0.85

2014-09-08 Thread 廖建锋
dear, As there are a lot of bugs in the keyvalue backend in the 0.80.5 firefly version, I want to upgrade to 0.85 for some osds which are already down and unable to start, and keep some other osds on 0.80.5. I am wondering, will it work? 廖建锋 Derek

Re: [ceph-users] Is ceph osd reweight always safe to use?

2014-09-08 Thread Christian Balzer
Hello, On Mon, 08 Sep 2014 18:30:07 -0400 JR wrote: > Hi Christian, all, > > Having researched this a bit more, it seemed that just doing > > ceph osd pool set rbd pg_num 128 > ceph osd pool set rbd pgp_num 128 > > might be the answer. Alas, it was not. After running the above the > cluster

[ceph-users] all my osds are down, but ceph -s tells they are up and in.

2014-09-08 Thread yuelongguang
hi, all. That is crazy. 1. All my osds are down, but ceph -s tells they are up and in. Why? 2. Now all osds are down; a vm is using rbd as its disk, and inside the vm fio is reading/writing the disk, but it hangs and cannot be killed. Why? Thanks [root@cephosd2-monb ~]# ceph -v ceph version 0.81 (8de9501df

Re: [ceph-users] SSD journal deployment experiences

2014-09-08 Thread Quenten Grasso
This reminds me of something I was trying to find out awhile back. If we have 2000 "Random" IOPS which are 4K blocks, our cluster (assuming 3 x Replicas) will generate 6000 IOPS @ 4K onto the journals. Does this mean our Journals will absorb 6000 IOPS and turn these into X IOPS onto our spin
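As a worked restatement of the question: 2000 client write IOPS x 3 replicas = 6000 4K writes/s landing on journals cluster-wide; if those journals sat on, say, 20 SSDs (a purely illustrative count), that would be roughly 300 journal writes/s each, before any coalescing happens on the data disks.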

Re: [ceph-users] Is ceph osd reweight always safe to use?

2014-09-08 Thread Christian Balzer
Hello, On Mon, 08 Sep 2014 13:50:08 -0400 JR wrote: > Hi Christian, > > I have 448 PGs and 448 PGPs (according to ceph -s). > > This seems borne out by: > > root@osd45:~# rados lspools > data > metadata > rbd > volumes > images > root@osd45:~# for i in $(rados lspools); do echo "$i pg($(ceph

Re: [ceph-users] Updating the pg and pgp values

2014-09-08 Thread Christian Balzer
Hello, On Mon, 08 Sep 2014 10:08:27 -0700 JIten Shah wrote: > While checking the health of the cluster, I ran into the following error: > > warning: health HEALTH_WARN too few pgs per osd (1< min 20) > > When I checked the pg and pgp numbers, I saw the value was the default > value of 64 > > ce

Re: [ceph-users] resizing the OSD

2014-09-08 Thread Christian Balzer
Hello, On Mon, 08 Sep 2014 09:53:58 -0700 JIten Shah wrote: > > On Sep 6, 2014, at 8:22 PM, Christian Balzer wrote: > > > > > Hello, > > > > On Sat, 06 Sep 2014 10:28:19 -0700 JIten Shah wrote: > > > >> Thanks Christian. Replies inline. > >> On Sep 6, 2014, at 8:04 AM, Christian Balzer w

Re: [ceph-users] OSD is crashing while running admin socket

2014-09-08 Thread Somnath Roy
Yeah!!.. Looks similar, but not entirely.. There is another potential race condition that may cause this. We are protecting the TrackedOp::events structure only during TrackedOp::mark_event with the lock mutex. I couldn't find it anywhere else. The events structure should also be protected during dump

Re: [ceph-users] OSD is crashing while running admin socket

2014-09-08 Thread Sage Weil
On Tue, 9 Sep 2014, Somnath Roy wrote: > Created the following tracker and assigned to me. > > http://tracker.ceph.com/issues/9384 By the way, this might be the same as or similar to http://tracker.ceph.com/issues/8885 Thanks! sage > > Thanks & Regards > Somnath > > -Original Message

Re: [ceph-users] OSD is crashing while running admin socket

2014-09-08 Thread Somnath Roy
Created the following tracker and assigned to me. http://tracker.ceph.com/issues/9384 Thanks & Regards Somnath -Original Message- From: Samuel Just [mailto:sam.j...@inktank.com] Sent: Monday, September 08, 2014 5:22 PM To: Somnath Roy Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.ker

Re: [ceph-users] OSD is crashing while running admin socket

2014-09-08 Thread Samuel Just
That seems reasonable. Bug away! -Sam On Mon, Sep 8, 2014 at 5:11 PM, Somnath Roy wrote: > Hi Sage/Sam, > > > > I faced a crash in OSD with latest Ceph master. Here is the log trace for > the same. > > > > ceph version 0.85-677-gd5777c4 (d5777c421548e7f039bb2c77cb0df2e9c7404723) > > 1: ceph-osd(

[ceph-users] OSD is crashing while running admin socket

2014-09-08 Thread Somnath Roy
Hi Sage/Sam, I faced a crash in OSD with latest Ceph master. Here is the log trace for the same. ceph version 0.85-677-gd5777c4 (d5777c421548e7f039bb2c77cb0df2e9c7404723) 1: ceph-osd() [0x990def] 2: (()+0xfbb0) [0x7f72ae6e6bb0] 3: (gsignal()+0x37) [0x7f72acc08f77] 4: (abort()+0x148) [0x7f72acc0c

Re: [ceph-users] Is ceph osd reweight always safe to use?

2014-09-08 Thread JR
Hi Christian, all, Having researched this a bit more, it seemed that just doing ceph osd pool set rbd pg_num 128 ceph osd pool set rbd pgp_num 128 might be the answer. Alas, it was not. After running the above the cluster just sat there. Finally, reading some more, I ran: ceph osd reweight-b

Re: [ceph-users] osd crash: trim_objectcould not find coid

2014-09-08 Thread Gregory Farnum
On Mon, Sep 8, 2014 at 2:53 PM, Francois Deppierraz wrote: > Hi Greg, > > Thanks for your support! > > On 08. 09. 14 20:20, Gregory Farnum wrote: > >> The first one is not caused by the same thing as the ticket you >> reference (it was fixed well before emperor), so it appears to be some >> kind o

Re: [ceph-users] osd crash: trim_objectcould not find coid

2014-09-08 Thread Francois Deppierraz
Hi Greg, Thanks for your support! On 08. 09. 14 20:20, Gregory Farnum wrote: > The first one is not caused by the same thing as the ticket you > reference (it was fixed well before emperor), so it appears to be some > kind of disk corruption. > The second one is definitely corruption of some kin

Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-09-08 Thread Sebastien Han
They definitely are Warren! Thanks for bringing this here :). On 05 Sep 2014, at 23:02, Wang, Warren wrote: > +1 to what Cedric said. > > Anything more than a few minutes of heavy sustained writes tended to get our > solid state devices into a state where garbage collection could not keep up.

Re: [ceph-users] osd crash: trim_objectcould not find coid

2014-09-08 Thread Gregory Farnum
On Mon, Sep 8, 2014 at 1:42 AM, Francois Deppierraz wrote: > Hi, > > This issue is on a small 2 servers (44 osds) ceph cluster running 0.72.2 > under Ubuntu 12.04. The cluster was filling up (a few osds near full) > and I tried to increase the number of pg per pool to 1024 for each of > the 14 poo

Re: [ceph-users] Delays while waiting_for_osdmap according to dump_historic_ops

2014-09-08 Thread Gregory Farnum
On Sun, Sep 7, 2014 at 4:28 PM, Alex Moore wrote: > I recently found out about the "ceph --admin-daemon > /var/run/ceph/ceph-osd..asok dump_historic_ops" command, and noticed > something unexpected in the output on my cluster, after checking numerous > output samples... > > It looks to me like "no
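For reference, the admin socket command being discussed, written out with a hypothetical OSD id:

ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok dump_historic_ops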

Re: [ceph-users] Updating the pg and pgp values

2014-09-08 Thread JIten Shah
Thanks. How do I query the OSDMap on monitors? Using "ceph osd pool get data pg"? or is there a way to get the full list of settings? —jiten On Sep 8, 2014, at 10:52 AM, Gregory Farnum wrote: > It's stored in the OSDMap on the monitors. > Software Engineer #42 @ http://inktank.com | http:/
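Two ways to read those values back out of the OSDMap (pool name 'data' assumed from the thread):

ceph osd dump | grep '^pool'     # one line per pool, including pg_num and pgp_num
ceph osd pool get data pg_num    # a single setting at a time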

Re: [ceph-users] Updating the pg and pgp values

2014-09-08 Thread Gregory Farnum
It's stored in the OSDMap on the monitors. Software Engineer #42 @ http://inktank.com | http://ceph.com On Mon, Sep 8, 2014 at 10:50 AM, JIten Shah wrote: > So, if it doesn’t refer to the entry in ceph.conf. Where does it actually > store the new value? > > —Jiten > > On Sep 8, 2014, at 10:31 A

Re: [ceph-users] Is ceph osd reweight always safe to use?

2014-09-08 Thread JR
Hi Christian, I have 448 PGs and 448 PGPs (according to ceph -s). This seems borne out by: root@osd45:~# rados lspools data metadata rbd volumes images root@osd45:~# for i in $(rados lspools); do echo "$i pg($(ceph osd pool get $i pg_num), pgp$(ceph osd pool get $i pg_num)"; done data pg(pg_num:
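Note that the loop above queries pg_num twice, so it can never reveal a pg_num/pgp_num mismatch; a corrected sketch of the same check:

for i in $(rados lspools); do
  echo "$i pg_num=$(ceph osd pool get $i pg_num | awk '{print $2}') pgp_num=$(ceph osd pool get $i pgp_num | awk '{print $2}')"
done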

Re: [ceph-users] Updating the pg and pgp values

2014-09-08 Thread JIten Shah
So, if it doesn’t refer to the entry in ceph.conf, where does it actually store the new value? —Jiten On Sep 8, 2014, at 10:31 AM, Gregory Farnum wrote: > On Mon, Sep 8, 2014 at 10:08 AM, JIten Shah wrote: >> While checking the health of the cluster, I ran into the following error: >> >> warni

Re: [ceph-users] Updating the pg and pgp values

2014-09-08 Thread JIten Shah
Thanks Greg. —Jiten On Sep 8, 2014, at 10:31 AM, Gregory Farnum wrote: > On Mon, Sep 8, 2014 at 10:08 AM, JIten Shah wrote: >> While checking the health of the cluster, I ran into the following error: >> >> warning: health HEALTH_WARN too few pgs per osd (1< min 20) >> >> When I checked the pg

Re: [ceph-users] Updating the pg and pgp values

2014-09-08 Thread Gregory Farnum
On Mon, Sep 8, 2014 at 10:08 AM, JIten Shah wrote: > While checking the health of the cluster, I ran into the following error: > > warning: health HEALTH_WARN too few pgs per osd (1< min 20) > > When I checked the pg and pgp numbers, I saw the value was the default value > of 64 > > ceph osd pool ge

[ceph-users] Updating the pg and pgp values

2014-09-08 Thread JIten Shah
While checking the health of the cluster, I ran into the following error: warning: health HEALTH_WARN too few pgs per osd (1< min 20) When I checked the pg and pgp numbers, I saw the value was the default value of 64 ceph osd pool get data pg_num pg_num: 64 ceph osd pool get data pgp_num pgp_num:
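A sketch of the usual remedy, with an illustrative target (the documentation of that era suggested roughly (number of OSDs x 100) / replica count, rounded up to a power of two):

ceph osd pool set data pg_num 256
ceph osd pool set data pgp_num 256   # keep pgp_num in step so data actually rebalances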

Re: [ceph-users] resizing the OSD

2014-09-08 Thread JIten Shah
On Sep 6, 2014, at 8:22 PM, Christian Balzer wrote: > > Hello, > > On Sat, 06 Sep 2014 10:28:19 -0700 JIten Shah wrote: > >> Thanks Christian. Replies inline. >> On Sep 6, 2014, at 8:04 AM, Christian Balzer wrote: >> >>> >>> Hello, >>> >>> On Fri, 05 Sep 2014 15:31:01 -0700 JIten Shah wr

Re: [ceph-users] Is ceph osd reweight always safe to use?

2014-09-08 Thread Christian Balzer
Hello, On Mon, 08 Sep 2014 11:42:59 -0400 JR wrote: > Greetings all, > > I have a small ceph cluster (4 nodes, 2 osds per node) which recently > started showing: > > root@ocd45:~# ceph health > HEALTH_WARN 1 near full osd(s) > > admin@node4:~$ for i in 2 3 4 5; do sudo ssh osd4$i df -h |egrep

Re: [ceph-users] Ceph object back up details

2014-09-08 Thread Yehuda Sadeh
Not sure I understand what you ask. Multiple zones within the same region configuration is described here: http://ceph.com/docs/master/radosgw/federated-config/#multi-site-data-replication Yehuda On Sun, Sep 7, 2014 at 10:32 PM, M Ranga Swami Reddy wrote: > Hi Yehuda, > I need more info on Ceph

[ceph-users] Is ceph osd reweight always safe to use?

2014-09-08 Thread JR
Greetings all, I have a small ceph cluster (4 nodes, 2 osds per node) which recently started showing: root@ocd45:~# ceph health HEALTH_WARN 1 near full osd(s) admin@node4:~$ for i in 2 3 4 5; do sudo ssh osd4$i df -h |egrep 'Filesystem|osd/ceph'; done Filesystem Size Used Avail Use% Mounte
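The command the subject line asks about, sketched with an illustrative osd id and weight; 'ceph osd reweight' sets a temporary 0..1 override on top of the CRUSH weight rather than changing the CRUSH weight itself:

ceph health detail        # shows which OSD is near full
ceph osd reweight 7 0.9   # shift some PGs off osd.7 (id and value are made up for illustration)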

Re: [ceph-users] ceph cluster inconsistency keyvaluestore

2014-09-08 Thread Haomai Wang
I'm not very sure, but it's possible that keyvaluestore uses sparse writes, which makes a big difference to ceph's space statistics On Mon, Sep 8, 2014 at 6:35 PM, Kenneth Waegeman wrote: > > Thank you very much ! > > Is this problem then related to the weird sizes I see: > pgmap v55220: 1216 pgs,
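The gap between reported data and space actually used that sparse writes produce can be reproduced with an ordinary sparse file, outside ceph entirely:

truncate -s 1G sparse.img
ls -lh sparse.img   # logical size: 1.0G
du -h sparse.img    # blocks actually allocated: (almost) nothing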

Re: [ceph-users] Ceph on RHEL 7 with multiple OSD's

2014-09-08 Thread BG
Also, for info, this is from the osd.0 log file: 2014-09-08 11:06:44.000663 7f41144c7700 0 log [WRN] : map e10 wrongly marked me down 2014-09-08 11:06:44.002595 7f41144c7700 0 osd.0 10 crush map has features 1107558400, adjusting msgr requires for mons 2014-09-08 11:06:44.003346 7f41072ab700 0

Re: [ceph-users] Ceph on RHEL 7 with multiple OSD's

2014-09-08 Thread BG
Sorry, no idea, this is a first time install I'm trying and I'm following the "Storage Cluster Quick Start" guide. Looking in the "ceph.log" file I do see warnings related to osd.0: 2014-09-08 11:06:44.000667 osd.0 10.119.16.15:6800/4433 1 : [WRN] map e10 wrongly marked me down I've also just not

Re: [ceph-users] Ceph on RHEL 7 with multiple OSD's

2014-09-08 Thread Loic Dachary
Hi, It looks like your osd.0 is down and you only have one osd left (osd.1), which would explain why the cluster cannot get to a healthy state. The "size 2" in "pool 0 'data' replicated size 2 ..." means the pool needs at least two OSDs up to function properly. Do you know why the osd.0 is n
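A sketch of how to confirm and act on that (the service invocation depends on how the OSDs were deployed, so treat the last line as an assumption):

ceph osd tree                     # shows osd.0/osd.1 up/down state and placement
ceph osd pool get data size       # the replica count the pool is trying to satisfy
sudo service ceph start osd.0     # on the host that carries osd.0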

Re: [ceph-users] Ceph on RHEL 7 with multiple OSD's

2014-09-08 Thread BG
Apologies for piggybacking this issue but I appear to have a similar problem with Firefly on a CentOS 7 install, thought it better to add it here rather than start a new thread. $ ceph --version ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6) $ ceph health HEALTH_WARN 96 pgs degra

Re: [ceph-users] ceph cluster inconsistency keyvaluestore

2014-09-08 Thread Kenneth Waegeman
Thank you very much ! Is this problem then related to the weird sizes I see: pgmap v55220: 1216 pgs, 3 pools, 3406 GB data, 852 kobjects 418 GB used, 88130 GB / 88549 GB avail a calculation with df shows indeed that there is about 400GB used on disks, but the tests I ran sho

Re: [ceph-users] I fail to add a monitor in a ceph cluster

2014-09-08 Thread Pascal GREGIS
Oh, I forgot to say: I made a mistake in my first message, the command you suggested to remove was in fact: $ sudo ceph mon add $HOSTNAME $IP and not $ sudo ceph-mon add $HOSTNAME $IP Anyway, removing it makes the whole thing work. But the doc says to execute it. Should the doc be changed? http:

Re: [ceph-users] I fail to add a monitor in a ceph cluster

2014-09-08 Thread Pascal GREGIS
Thanks, it seems to work. I also had to complete the "mon host" line in the config file to add my second (and then my third) monitor: mon host = 172.16.1.11,172.16.1.12,172.16.1.13 otherwise it still didn't work when I stopped 1 of the 3 monitors, I mean when I stopped grenier, the only one which was
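The ceph.conf fragment being described (IPs taken from the message; everything else is context):

[global]
# list all monitors so a client can still find a quorum when one of them is down
mon host = 172.16.1.11,172.16.1.12,172.16.1.13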

Re: [ceph-users] SSD journal deployment experiences

2014-09-08 Thread Dan Van Der Ster
Hi Scott, > On 06 Sep 2014, at 20:39, Scott Laird wrote: > > IOPS are weird things with SSDs. In theory, you'd see 25% of the write IOPS > when writing to a 4-way RAID5 device, since you write to all 4 devices in > parallel. Except that's not actually true--unlike HDs where an IOP is an > I

[ceph-users] osd crash: trim_objectcould not find coid

2014-09-08 Thread Francois Deppierraz
Hi, This issue is on a small 2 servers (44 osds) ceph cluster running 0.72.2 under Ubuntu 12.04. The cluster was filling up (a few osds near full) and I tried to increase the number of pg per pool to 1024 for each of the 14 pools to improve storage space balancing. This increase triggered high mem

Re: [ceph-users] Crush Location

2014-09-08 Thread Wido den Hollander
On 09/08/2014 09:57 AM, Jakes John wrote: Hi all, I have been reading ceph-crush-location hook ( http://ceph.com/docs/master/rados/operations/crush-map/ ) which says to add crush location field in the conf file to provide location awareness to ceph deamons and clients. I would like to

[ceph-users] delete performance

2014-09-08 Thread Luis Periquito
Hi, I've been trying to tweak and improve the performance of our ceph cluster. One of the operations that I can't seem to be able to improve much is the delete. From what I've gathered every time there is a delete it goes directly to the HDD, hitting its performance - the op may be recorded in th

[ceph-users] Crush Location

2014-09-08 Thread Jakes John
Hi all, I have been reading ceph-crush-location hook ( http://ceph.com/docs/master/rados/operations/crush-map/ ) which says to add crush location field in the conf file to provide location awareness to ceph deamons and clients. I would like to understand whether ceph use this location awa
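A sketch of the kind of ceph.conf entry that hook reads, with illustrative bucket names:

[osd]
crush location = root=default rack=rack1 host=node1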