[ceph-users] Adding new CEPH monitor keep SYNCHRONIZING

2015-05-18 Thread Ali Hussein

*Hi all*

I have two Ceph monitors that have been working fine since I added them a 
while ago. Now I have added a new Ceph monitor, and it keeps showing the 
following in its log file:


2015-05-18 10:54:42.585123 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 44% 
total 51175 MB, used 28280 MB, avail 22894 MB
2015-05-18 10:55:44.418861 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 44% 
total 51175 MB, used 28437 MB, avail 22737 MB
2015-05-18 10:56:45.442884 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 44% 
total 51175 MB, used 28287 MB, avail 22887 MB
2015-05-18 10:57:47.710088 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 44% 
total 51175 MB, used 28449 MB, avail 22725 MB
2015-05-18 10:59:13.436988 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 44% 
total 51175 MB, used 28266 MB, avail 22908 MB
2015-05-18 11:00:15.069245 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 44% 
total 51175 MB, used 28511 MB, avail 22663 MB
2015-05-18 11:01:46.333054 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 44% 
total 51175 MB, used 28285 MB, avail 22889 MB
2015-05-18 11:02:48.268613 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 44% 
total 51175 MB, used 28521 MB, avail 22653 MB
2015-05-18 11:04:21.107442 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 44% 
total 51175 MB, used 28287 MB, avail 22887 MB
2015-05-18 11:05:24.336678 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 44% 
total 51175 MB, used 28552 MB, avail 22622 MB
2015-05-18 11:07:02.355146 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 44% 
total 51175 MB, used 28266 MB, avail 22908 MB
2015-05-18 11:08:04.168761 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 44% 
total 51175 MB, used 28527 MB, avail 22647 MB
2015-05-18 11:09:25.942629 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 44% 
total 51175 MB, used 28296 MB, avail 22878 MB
2015-05-18 11:10:28.410838 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 44% 
total 51175 MB, used 28555 MB, avail 22619 MB
2015-05-18 11:12:06.534287 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 44% 
total 51175 MB, used 28284 MB, avail 22890 MB
2015-05-18 11:13:09.433899 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 44% 
total 51175 MB, used 28337 MB, avail 22837 MB
2015-05-18 11:14:09.485415 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 44% 
total 51175 MB, used 28297 MB, avail 22877 MB
2015-05-18 11:15:13.061472 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 44% 
total 51175 MB, used 28520 MB, avail 22654 MB
2015-05-18 11:16:47.296862 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 44% 
total 51175 MB, used 28288 MB, avail 22886 MB
2015-05-18 11:17:48.454379 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 44% 
total 51175 MB, used 28480 MB, avail 22694 MB
2015-05-18 11:19:24.178109 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 44% 
total 51175 MB, used 28288 MB, avail 22886 MB
2015-05-18 11:20:25.370240 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 44% 
total 51175 MB, used 28536 MB, avail 22638 MB
2015-05-18 11:21:25.424047 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 43% 
total 51175 MB, used 28712 MB, avail 22462 MB
2015-05-18 11:22:27.585467 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 43% 
total 51175 MB, used 28978 MB, avail 22196 MB
2015-05-18 11:23:27.666819 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 42% 
total 51175 MB, used 29197 MB, avail 21977 MB
2015-05-18 11:24:30.853669 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 42% 
total 51175 MB, used 29429 MB, avail 21745 MB
2015-05-18 11:25:32.661380 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 42% 
total 51175 MB, used 29672 MB, avail 21502 MB
2015-05-18 11:26:32.979662 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 41% 
total 51175 MB, used 29716 MB, avail 21458 MB
2015-05-18 11:27:33.285880 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 41% 
total 51175 MB, used 29871 MB, avail 21303 MB
2015-05-18 11:28:36.325777 7f4a9609d700  0 
mon.monitor03@2(synchronizing).data_health(0) update_stats avail 41% 
total 51175 MB, used 30034 MB, avail 21140 MB
2015-05-18 11:29:36.385018 

Re: [ceph-users] new relic ceph plugin

2015-05-18 Thread John Spray
Not that I know of, but if you wanted to repurpose this code it would 
probably be pretty easy:

https://github.com/ceph/Diamond/blob/calamari/src/collectors/ceph/ceph.py

Cheers,
John

On 17/05/2015 23:19, German Anders wrote:

Hi all,

I want to know if someone has deployed a New Relic (Python) plugin for 
Ceph.


Thanks a lot,

Best regards,


*Ger*


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




Re: [ceph-users] Adding new CEPH monitor keep SYNCHRONIZING

2015-05-18 Thread Mohamed Pakkeer
Hi Ali,

Which version of Ceph are you using? Are there any re-spawning OSDs?

Regards
K.Mohamed Pakkeer

On Mon, May 18, 2015 at 2:23 PM, Ali Hussein <
ali.alkhazr...@earthlinktele.com> wrote:

> [original message and monitor log snipped]

Re: [ceph-users] Adding new CEPH monitor keep SYNCHRONIZING

2015-05-18 Thread Ali Hussein
The two old monitors run Ceph version 0.87.1, while the newly added 
monitor runs 0.87.2.

P.S.: NTP is installed and working fine.
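
For reference, the new monitor's own view also just reports the synchronizing 
state; I am checking it on the new node via its admin socket, assuming the 
default socket path:

$ ceph daemon mon.monitor03 mon_status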

On 18/05/2015 12:11 PM, Mohamed Pakkeer wrote:

[quoted message and monitor log snipped]

Re: [ceph-users] Adding new CEPH monitor keep SYNCHRONIZING

2015-05-18 Thread Joao Eduardo Luis
On 05/18/2015 10:33 AM, Ali Hussein wrote:
> The two old monitors run Ceph version 0.87.1, while the newly added
> monitor runs 0.87.2.
> P.S.: NTP is installed and working fine.

This is not related to clocks (or, at least, it should not be).

State 'synchronizing' means the monitor is getting its store
synchronized from the other monitors, so that it gets to a consistent
state with the remaining monitors in order to form quorum.

If the monitor stores on the already in-quorum monitors are too big
(1 GB+) it may take a while.  When the monitor stores are several GBs in
size, dozens even (which is your case), this will be a hard task that may
need some fine tuning.

The upside is that when a store is big enough to cause this sort of
problem during synchronization, it also usually means the cluster is in
bad shape, usually due to lots of osdmaps in the monitor stores, which
is a symptom of an unclean, unhealthy cluster.

If you have an unhealthy cluster, you should first try stabilizing the
cluster and get a HEALTH_OK.  After that, the monitors will trim
no-longer-necessary maps and synchronization will be faster.  Once you
reach HEALTH_OK, restart the monitor you are trying to get into the
cluster and synchronization should run just fine.
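
For example, something along these lines will show how big the stores are
and whether the cluster has settled; the paths, monitor ids and restart
commands are only illustrative, so adjust them to your deployment:

  # size of the store on one of the in-quorum monitors (default data path)
  $ du -sh /var/lib/ceph/mon/ceph-monitor01/store.db

  # wait for the cluster to settle
  $ ceph health detail

  # then restart the joining monitor, depending on your init system, e.g.
  $ sudo restart ceph-mon id=monitor03        # Upstart
  $ sudo service ceph restart mon.monitor03   # sysvinit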

  -Joao

> [quoted messages and monitor log snipped]

Re: [ceph-users] new relic ceph plugin

2015-05-18 Thread German Anders
Thanks a lot John, will definitely take a look at that.

Best regards,


*German Anders*
Storage System Engineer Leader
*Despegar* | IT Team
*office* +54 11 4894 3500 x3408
*mobile* +54 911 3493 7262
*mail* gand...@despegar.com

2015-05-18 6:04 GMT-03:00 John Spray :

> Not that I know of, but if you wanted to repurpose this code it would
> probably be pretty easy:
> https://github.com/ceph/Diamond/blob/calamari/src/collectors/ceph/ceph.py
>
> Cheers,
> John
>
>
> On 17/05/2015 23:19, German Anders wrote:
>
>> [original message snipped]
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Cache Pool Flush/Eviction Limits - Hard or Soft?

2015-05-18 Thread Nick Fisk
Just to update on this, I've been watching iostat across my Ceph nodes and I
can see something slightly puzzling happening, which is most likely the cause
of the slow (>32s) requests I am getting.

During a client write-only IO stream, I see reads and writes to the cache
tier, which is normal as blocks are being promoted/demoted. The latency does
suffer, but not excessively and is acceptable for data that has fallen out
of cache.

However, every now and again it appears that one of the OSDs suddenly just
starts aggressively reading and appears to block any IO until that read has
finished. Example below, where /dev/sdd is a 10K disk in the cache tier. All
other nodes have their /dev/sdd devices completely idle during this
period. The disks on the base tier seem to be doing writes during this
period, so it looks related to some sort of flushing.

Device  rrqm/s  wrqm/s     r/s   w/s    rkB/s  wkB/s  rq-sz  qu-sz  await  r_wait  w_wait  svctm   util
sdd       0.00    0.00  471.50  0.00  2680.00   0.00  11.37   0.96   2.03    2.03    0.00   1.90  89.80

Most of the times I observed this whilst watching iostat, the read
only lasted around 5-10s, but I suspect that sometimes it goes on for
longer and is the cause of the "requests are blocked" errors. I have also
noticed that this appears to happen more often when there are a
greater number of blocks to be promoted/demoted. Other pools are not
affected during these hangs.

From the look of the iostat stats, I would assume that for a 10k disk it
must be doing sequential reads to get that number of IOs.

Does anybody have any clue what might be going on?

Nick

> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Nick Fisk
> Sent: 30 April 2015 12:53
> To: ceph-users@lists.ceph.com
> Subject: [ceph-users] Cache Pool Flush/Eviction Limits - Hard or Soft?
> 
> Does anyone know if the Flush and Eviction limits are hard limits, i.e. as
> soon as they are exceeded writes will block, or will the pool only block
> when it reaches Target_max_bytes?
>
> I'm seeing really poor performance and frequent "requests are blocked"
> messages once data starts having to be evicted/flushed and I was just
> wondering if the above was true.
>
> If the limits are soft, I would imagine making high and low target limits
> would help:
>
> Target_dirty_bytes_low=.3
> Target_dirty_bytes_high=.4
>
> Once the amount of dirty bytes passes the low limit a very low priority
> flush occurs; if the high limit is reached, data is flushed much more
> aggressively. The same could also exist for eviction. This would allow
> bursts of write activity to occur before flushing starts heavily impacting
> performance.
>
> Nick
>
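
For reference, the knobs that already exist for this are set per pool,
roughly as below; the pool name and values are only placeholders, not from
my cluster:

  $ ceph osd pool set hot-cache cache_target_dirty_ratio 0.4    # start flushing dirty objects at 40% of target_max_bytes
  $ ceph osd pool set hot-cache cache_target_full_ratio 0.8     # start evicting objects at 80% of target_max_bytes
  $ ceph osd pool set hot-cache target_max_bytes 1099511627776  # 1 TiB cap for the cache pool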


 



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Hammer cache behavior

2015-05-18 Thread Brian Rak
We just enabled a small cache pool on one of our clusters (v 0.94.1) and 
have run into some issues:


1) Cache population appears to happen via the public network (not the 
cluster network).  We're seeing basically no traffic on the cluster 
network, and multiple gigabits inbound to our cache OSDs. Normal 
rebuild/recovery happens via the cluster network, so I don't believe 
this is just a configuration issue.


2) Similar to #1, I was expecting to see cache traffic show up as repair 
traffic in 'ceph status'.  Instead, it seems to appear as client traffic.


3) We're using a readonly pool (we only really write to our pools 
once).  I noticed that if all the OSDs hosting the cache pool go down, 
all reads stop until they're restored.  I would have expected that reads 
would fall back to the backing pool if the cache pool is unavailable.  
Is this how it's supposed to work?


Any thoughts on these?  Are my expectations just wrong here?  The 
documentation is fairly sparse, so I'm not quite sure what to expect.
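
For context, this is roughly how a readonly tier gets wired up; the pool 
names below are placeholders rather than our actual pools:

  $ ceph osd tier add base-pool cache-pool
  $ ceph osd tier cache-mode cache-pool readonly
  $ ceph osd tier set-overlay base-pool cache-pool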

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] OSD unable to start (giant -> hammer)

2015-05-18 Thread Berant Lemmenes
Hello all,

I've encountered a problem when upgrading my single node home cluster from
giant to hammer, and I would greatly appreciate any insight.

I upgraded the packages like normal, then proceeded to restart the mon and
once that came back restarted the first OSD (osd.3). However it
subsequently won't start and crashes with the following failed assertion:

osd/OSD.h: 716: FAILED assert(ret)

 ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff)

 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x7f) [0xb1784f]

 2: (OSD::load_pgs()+0x277b) [0x6850fb]

 3: (OSD::init()+0x1448) [0x6930b8]

 4: (main()+0x26b9) [0x62fd89]

 5: (__libc_start_main()+0xed) [0x7f2345bc976d]

 6: ceph-osd() [0x635679]

 NOTE: a copy of the executable, or `objdump -rdS ` is needed
to interpret this.


--- logging levels ---

   0/ 5 none

   0/ 1 lockdep

   0/ 1 context

   1/ 1 crush

   1/ 5 mds

   1/ 5 mds_balancer

   1/ 5 mds_locker

   1/ 5 mds_log

   1/ 5 mds_log_expire

   1/ 5 mds_migrator

   0/ 1 buffer

   0/ 1 timer

   0/ 1 filer

   0/ 1 striper

   0/ 1 objecter

   0/ 5 rados

   0/ 5 rbd

   0/ 5 rbd_replay

   0/ 5 journaler

   0/ 5 objectcacher

   0/ 5 client

   0/ 5 osd

   0/ 5 optracker

   0/ 5 objclass

   1/ 3 filestore

   1/ 3 keyvaluestore

   1/ 3 journal

   0/ 5 ms

   1/ 5 mon

   0/10 monc

   1/ 5 paxos

   0/ 5 tp

   1/ 5 auth

   1/ 5 crypto

   1/ 1 finisher

   1/ 5 heartbeatmap

   1/ 5 perfcounter

   1/ 5 rgw

   1/10 civetweb

   1/ 5 javaclient

   1/ 5 asok

   1/ 1 throttle

   0/ 0 refs

   1/ 5 xio

  -2/-2 (syslog threshold)

  99/99 (stderr threshold)

  max_recent 1

  max_new 1000

  log_file

--- end dump of recent events ---

terminate called after throwing an instance of 'ceph::FailedAssertion'

*** Caught signal (Aborted) **

 in thread 7f2347f71780

 ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff)

 1: ceph-osd() [0xa1fe55]

 2: (()+0xfcb0) [0x7f2346fb1cb0]

 3: (gsignal()+0x35) [0x7f2345bde0d5]

 4: (abort()+0x17b) [0x7f2345be183b]

 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f234652f69d]

 6: (()+0xb5846) [0x7f234652d846]

 7: (()+0xb5873) [0x7f234652d873]

 8: (()+0xb596e) [0x7f234652d96e]

 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x259) [0xb17a29]

 10: (OSD::load_pgs()+0x277b) [0x6850fb]

 11: (OSD::init()+0x1448) [0x6930b8]

 12: (main()+0x26b9) [0x62fd89]

 13: (__libc_start_main()+0xed) [0x7f2345bc976d]

 14: ceph-osd() [0x635679]

2015-05-18 13:02:33.643064 7f2347f71780 -1 *** Caught signal (Aborted) **

 in thread 7f2347f71780


 ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff)

 1: ceph-osd() [0xa1fe55]

 2: (()+0xfcb0) [0x7f2346fb1cb0]

 3: (gsignal()+0x35) [0x7f2345bde0d5]

 4: (abort()+0x17b) [0x7f2345be183b]

 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f234652f69d]

 6: (()+0xb5846) [0x7f234652d846]

 7: (()+0xb5873) [0x7f234652d873]

 8: (()+0xb596e) [0x7f234652d96e]

 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x259) [0xb17a29]

 10: (OSD::load_pgs()+0x277b) [0x6850fb]

 11: (OSD::init()+0x1448) [0x6930b8]

 12: (main()+0x26b9) [0x62fd89]

 13: (__libc_start_main()+0xed) [0x7f2345bc976d]

 14: ceph-osd() [0x635679]

 NOTE: a copy of the executable, or `objdump -rdS ` is needed
to interpret this.


--- begin dump of recent events ---

 0> 2015-05-18 13:02:33.643064 7f2347f71780 -1 *** Caught signal
(Aborted) **

 in thread 7f2347f71780


 ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff)

 1: ceph-osd() [0xa1fe55]

 2: (()+0xfcb0) [0x7f2346fb1cb0]

 3: (gsignal()+0x35) [0x7f2345bde0d5]

 4: (abort()+0x17b) [0x7f2345be183b]

 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f234652f69d]

 6: (()+0xb5846) [0x7f234652d846]

 7: (()+0xb5873) [0x7f234652d873]

 8: (()+0xb596e) [0x7f234652d96e]

 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x259) [0xb17a29]

 10: (OSD::load_pgs()+0x277b) [0x6850fb]

 11: (OSD::init()+0x1448) [0x6930b8]

 12: (main()+0x26b9) [0x62fd89]

 13: (__libc_start_main()+0xed) [0x7f2345bc976d]

 14: ceph-osd() [0x635679]

 NOTE: a copy of the executable, or `objdump -rdS ` is needed
to interpret this.


--- logging levels ---

   0/ 5 none

   0/ 1 lockdep

   0/ 1 context

   1/ 1 crush

   1/ 5 mds

   1/ 5 mds_balancer

   1/ 5 mds_locker

   1/ 5 mds_log

   1/ 5 mds_log_expire

   1/ 5 mds_migrator

   0/ 1 buffer

   0/ 1 timer

   0/ 1 filer

   0/ 1 striper

   0/ 1 objecter

   0/ 5 rados

   0/ 5 rbd

   0/ 5 rbd_replay

   0/ 5 journaler

   0/ 5 objectcacher

   0/ 5 client

   0/ 5 osd

   0/ 5 optracker

   0/ 5 objclass

   1/ 3 filestore

   1/ 3 keyvaluestore

   1/ 3 journal

   0/ 5 ms

   1/ 5 mon

   0/10 monc

   1/ 5 paxos

   0/ 5 tp

   1/ 5 auth

   1/ 5 crypto

   1/ 1 finisher

   1/ 5 heartbeatmap

   1/ 5 perfcounter

   1/ 5 rgw

   1/10 civetweb

   1/ 5 javaclient

   1/ 5 asok

   

Re: [ceph-users] OSD unable to start (giant -> hammer)

2015-05-18 Thread Samuel Just
You have most likely hit http://tracker.ceph.com/issues/11429.  There are some 
workarounds in the bugs marked as duplicates of that bug, or you can wait for 
the next hammer point release.
-Sam

- Original Message -
From: "Berant Lemmenes" 
To: ceph-users@lists.ceph.com
Sent: Monday, May 18, 2015 10:24:38 AM
Subject: [ceph-users] OSD unable to start (giant -> hammer)

[original message with full backtrace and logging-level dump snipped]

Re: [ceph-users] OSD unable to start (giant -> hammer)

2015-05-18 Thread Berant Lemmenes
Sam,

Thanks for taking a look. It does seem to fit my issue. Would just removing
the 5.0_head directory be appropriate or would using ceph-objectstore-tool
be better?
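
If it helps, what I had in mind for the ceph-objectstore-tool route is
roughly the following, run while osd.3 is stopped; the data/journal paths
and the 5.0 pgid are from my node, so treat them as placeholders:

  $ ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-3 \
        --journal-path /var/lib/ceph/osd/ceph-3/journal \
        --pgid 5.0 --op remove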

Thanks,
Berant

On Mon, May 18, 2015 at 1:47 PM, Samuel Just  wrote:

> You have most likely hit http://tracker.ceph.com/issues/11429.  There are
> some workarounds in the bugs marked as duplicates of that bug, or you can
> wait for the next hammer point release.
> -Sam
>
> [quoted original message snipped]

Re: [ceph-users] Radosgw startup failures & misdirected client requests

2015-05-18 Thread Abhishek L

[..]
> Seeing this in the firefly cluster as well. Tried a couple of rados
> commands on the .rgw.root pool this is what is happening:
>
> abhi@st:~$ sudo rados -p .rgw.root put test.txt test.txt
> error putting .rgw.root/test.txt: (6) No such device or address
>
> abhi@st:~$ sudo ceph osd map .rgw.root test.txt
> osdmap e83 pool '.rgw.root' (6) object 'test.txt' -> pg 6.8b0b6108
> (6.0) -> up ([1,2,0], p1) acting ([1,2,0], p1)
>
> abhi@st:~$ sudo ceph pg map 6.8b0b6108
> osdmap e83 pg 6.8b0b6108 (6.0) -> up [0,2,1] acting [0,2,1]
>
> Looks like the osd map says the object must go to primary osd as 1,
> whereas pg map says that the pg is hosted with 0 as primary.
>
[..]

Solved the problem; just posting it here in case anyone comes
across this same error.

Primarily the issue was due to a misconfiguration from our config
management system, where `osd pool default pgp num` got set in ceph.conf
and `pg num` didn't, which led to the rgw pools having the default pg
num (8) and pgp_num set to a value of 128. While `ceph osd pool create`
will fail without specifying the pg count, `rados mkpool` does allow pool
creation without specifying it, falling back to the default values, which
probably explains what happened to the .rgw.default pool.

An easy way to simulate this error would be to inject a setting like
`ceph tell mon.0 injectargs --osd_pool_default_pgp_num 128`

and then start a fresh radosgw (assuming it's not installed previously),
or create any pool with the rados commands; object puts will then fail
because of the increased pgp count compared to the pg count.
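
For anyone hitting this, one way to check a pool and bring the two values
back in line looks roughly like this; the pool name and count are only
examples:

abhi@st:~$ sudo ceph osd pool get .rgw.root pg_num
abhi@st:~$ sudo ceph osd pool get .rgw.root pgp_num
abhi@st:~$ sudo ceph osd pool set .rgw.root pg_num 128
abhi@st:~$ sudo ceph osd pool set .rgw.root pgp_num 128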

-- 
Abhishek


signature.asc
Description: PGP signature
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Avoid buckets creation

2015-05-18 Thread Florent MONTHEL
Hi List,

We would like to avoid the creation of buckets directly by users or subusers 
(S3 or Swift).
We would like to do it through an administration interface (with a user that 
has special rights) in order to normalize bucket names.
Is it possible to do this (with caps or a parameter)?
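
For example, is the per-user bucket limit the right knob for this, something 
like the command below (the uid is just an example), and does 0 actually mean 
"no new buckets" rather than "unlimited" on our version?

  radosgw-admin user modify --uid=johndoe --max-buckets=0
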
Thanks

Sent from my iPhone
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Rados bench and Client io does not match

2015-05-18 Thread Barclay Jameson
Running a benchmark :
rados bench -p cephfs_data 300 write --no-cleanup

While watching ceph client io:
ceph -w

I get different numbers.

The output from rados bench is as follows:
Total time run: 300.108725
Total writes made:  66306
Write size: 4194304
Bandwidth (MB/sec): 883.760

Stddev Bandwidth:   65.2804
Max bandwidth (MB/sec): 968
Min bandwidth (MB/sec): 0
Average Latency:0.0724089
Stddev Latency: 0.0415512
Max latency:0.674297
Min latency:0.0299922

While watching the client io it shoots up to around 1900 MB/s:
2015-05-18 16:17:16.416059 mon.0 [INF] pgmap v774: 4096 pgs: 4096
active+clean; 203 GB data, 607 GB used, 173 TB / 174 TB avail; 1870 MB/s
wr, 467 op/s

The max bandwidth that rados reported was 968 MB/s.

The pool is rep 3 with 48 OSDs 4096 PGs.

Any thoughts?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Problem deploying a ceph cluster built from source

2015-05-18 Thread Aakanksha Pudipeddi-SSI
Hello all,

I am attempting to install a ceph cluster which has been built from source. I 
first cloned the Ceph master repository and then followed steps given in the 
Ceph documentation about installing a Ceph build. So I now have the binaries 
available in /usr/local/bin.

The next step is for me to deploy this build and I used ceph-deploy to do that:
$ceph-deploy install --dev=master 
This is where I get this error:
[ceph_deploy.conf][DEBUG ] found configuration file at: 
/home/ssd/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.23): /usr/bin/ceph-deploy install 
--dev=master Hostname
[ceph_deploy.install][DEBUG ] Installing dev version master on cluster ceph 
hosts Hostname
[ceph_deploy.install][DEBUG ] Detecting platform for host Hostname ...
[Hostname][DEBUG ] connection detected need for sudo
[Hostname][DEBUG ] connected to host: Hostname
[Hostname][DEBUG ] detect platform information from remote host
[Hostname][DEBUG ] detect machine type
[ceph_deploy.install][INFO  ] Distro info: Ubuntu 14.04 trusty
[Hostname][INFO  ] installing ceph on Hostname
[Hostname][INFO  ] Running command: sudo env DEBIAN_FRONTEND=noninteractive 
apt-get -q install --assume-yes ca-certificates
[Hostname][DEBUG ] Reading package lists...
[Hostname][DEBUG ] Building dependency tree...
[Hostname][DEBUG ] Reading state information...
[Hostname][DEBUG ] ca-certificates is already the newest version.
[Hostname][DEBUG ] The following packages were automatically installed and are 
no longer required:
[Hostname][DEBUG ]   libcephfs1 librados2 librbd1 python-ceph python-flask 
python-itsdangerous
[Hostname][DEBUG ]   python-werkzeug
[Hostname][DEBUG ] Use 'apt-get autoremove' to remove them.
[Hostname][DEBUG ] 0 upgraded, 0 newly installed, 0 to remove and 389 not 
upgraded.
[Hostname][INFO  ] Running command: sudo wget -O autobuild.asc 
https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/autobuild.asc
[Hostname][WARNIN] --2015-05-18 15:39:01--  
https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/autobuild.asc
[Hostname][WARNIN] Resolving ceph.com (ceph.com)...
[Hostname][WARNIN] Connecting to ceph.com (ceph.com) ... connected.
[Hostname][WARNIN] ERROR: cannot verify ceph.com's certificate, issued by 
...
[Hostname][WARNIN]   Unable to locally verify the issuer's authority.
[Hostname][WARNIN] To connect to ceph.com insecurely, use 
`--no-check-certificate'.
[Hostname][WARNIN] command returned non-zero exit status: 5
[Hostname][INFO  ] Running command: sudo apt-key add autobuild.asc
[Hostname][WARNIN] gpg: no valid OpenPGP data found.
[Hostname][ERROR ] RuntimeError: command returned non-zero exit status: 2
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: apt-key add 
autobuild.asc

I have checked the SSL certificates and ca-certificates and they are the 
newest version. I had deployed a cluster with ceph-deploy earlier using the 
Quick installation guide and got this certificate error then as well, but I 
used the --no-check-certificate flag and it worked well. While using the 
ceph-deploy install command during the quick installation process, I used 
--no-adjust-repos and the same certificate verification error disappeared. But 
the same flag does not work here, as ceph-deploy install --dev and 
--no-adjust-repos do not go together.

I have two questions:

1.  Is this the right way to deploy a Ceph cluster built from source? I am 
asking since there is no end-to-end document describing how to deploy a Ceph 
cluster that has been built from source.

2.  In cases where the ceph.com certificate cannot be verified, is it possible 
to pass something equivalent to the --no-check-certificate flag so that the 
installation does not stop because of an unverified certificate? (A rough 
sketch of the workaround I have in mind is below.)
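
For what it's worth, the manual workaround I have in mind (mirroring the 
commands ceph-deploy runs, but with certificate checking disabled) is roughly:

$ wget --no-check-certificate -O autobuild.asc 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/autobuild.asc'
$ sudo apt-key add autobuild.asc
$ ceph-deploy install --dev=master Hostname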

Thanks!
Aakanksha


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] client.radosgw.gateway for 2 radosgw servers

2015-05-18 Thread Florent MONTHEL
Hi List,

I would like to know the best way to have several radosgw servers on the same 
cluster with the same ceph.conf file.

For now I have 2 radosgw servers, but I have 1 conf file on each, with the 
below section on parrot:

[client.radosgw.gateway]
host = parrot
keyring = /etc/ceph/ceph.client.radosgw.keyring
rgw socket path = ""
log file = /var/log/radosgw/client.radosgw.gateway.log
rgw frontends = fastcgi socket_port=9000 socket_host=0.0.0.0
rgw print continue = false
rgw enable usage log = true
rgw usage log tick interval = 30
rgw usage log flush threshold = 1024
rgw usage max shards = 32
rgw usage max user shards = 1

And below section on cougar node :

[client.radosgw.gateway]
host = cougar
keyring = /etc/ceph/ceph.client.radosgw.keyring
rgw socket path = ""
log file = /var/log/radosgw/client.radosgw.gateway.log
rgw frontends = fastcgi socket_port=9000 socket_host=0.0.0.0
rgw print continue = false
rgw enable usage log = true
rgw usage log tick interval = 30
rgw usage log flush threshold = 1024
rgw usage max shards = 32
rgw usage max user shards = 1

Is it possible to have 2 different keys for parrot and cougar and 2 
client.radosgw sections, in order to have the same ceph.conf for the whole 
cluster (and use ceph-deploy to push the conf)?
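
In other words, is something along these lines supported, with one key per 
instance? The section names and paths below are just an illustration of what 
I mean:

[client.radosgw.parrot]
host = parrot
keyring = /etc/ceph/ceph.client.radosgw.parrot.keyring
log file = /var/log/radosgw/client.radosgw.parrot.log
rgw frontends = fastcgi socket_port=9000 socket_host=0.0.0.0

[client.radosgw.cougar]
host = cougar
keyring = /etc/ceph/ceph.client.radosgw.cougar.keyring
log file = /var/log/radosgw/client.radosgw.cougar.log
rgw frontends = fastcgi socket_port=9000 socket_host=0.0.0.0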

Thanks
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Documentation regarding content of each pool of radosgw

2015-05-18 Thread Florent MONTHEL
Hi List,

I would like to know the content of each radosgw pool in order to 
understand how it is used.
So I have checked the contents with rados ls -p poolname

### .intent-log
=> this pool is empty on my side. What is it needed for?


### .log
=> this pool is empty on my side. What is it needed for?


### .rgw
=> bucket names and metadata carrying the id of the region and the id of the 
owning user?

Example :
.bucket.meta.montest:default.69634.14
montest

default is the region, right?
14 is the user id, right?
69634, what is this id? The id of the pool?


### .rgw.buckets
=> Objects in the pool

Example :
default.70632.2_Guzzle/Plugin/Cookie/CookieJar/CookieJarInterface.php

default is the region, right?
2 is the owner id, right?
70632, what is this id? The id of the pool?


### .rgw.buckets.extra
=> this pool is empty on my side. What is it needed for?


### .rgw.buckets.index
=> content of buckets index

Example :
.dir.default.69634.12

default is the region, right?
12 is the owner id, right?
69634, what is this id? The id of the pool?


### .rgw.control
=> don’t know…

Example :
notify.1
notify.5


### .rgw.gc
=> don’t know…

Example :
gc.9
gc.31


### .rgw.root
=> content of region

Example :
default.region
region_info.default
zone_info.default


### .usage
=> content of usage data. 1 object per user ?

Example :
usage.17

is 17 the id of the user?


### .users
=> content of access key. s3 only I think

Example :
Z78IS5F47QQJTB2DNVVC
HW698PZQDTZVLLO79NES
snausr016


### .users.email
=> email list


### .users.swift
=> subuser swift only

Example :
fmonthel:swift


### .users.uid
=> User id list with id.buckets

Example :
fmonthel
fmonthel.buckets

What is fmonthel.buckets needed for?
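
If it helps with decoding the ids above, radosgw-admin can dump the metadata 
behind them; for example, for the test bucket mentioned earlier:

$ radosgw-admin metadata get bucket:montest
$ radosgw-admin bucket stats --bucket=montest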

Thanks for your help!
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com