Re: [ceph-users] [PG] Slow request *** seconds old,v4 currently waiting for pg to exist locally

2014-09-25 Thread Aegeaner
Yeah, three of nine OSDs went down. I recreated them, but the PGs
could not be recovered.


I don't know how to erase the PGs by themselves, so I deleted all the OSD pools,
including data and metadata … Now all PGs are active and clean...


I'm not sure if there are more elegant ways to deal with this.
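
(For anyone curious, deleting and recreating a pool generally looks like the sketch below; recent releases ask for the pool name twice plus a confirmation flag, the pg count is just a placeholder, and this of course destroys the pool's data:)

  # delete the pool (name given twice plus a safety flag)
  ceph osd pool delete data data --yes-i-really-really-mean-it
  # recreate it with a placeholder pg_num
  ceph osd pool create data 128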

===
Aegeaner


On 2014-09-25 14:11, Irek Fasikhov wrote:

osd_op(client.4625.1:9005787)
.


This is due to external factors. For example, the network settings.

2014-09-25 10:05 GMT+04:00 Udo Lembke:


Hi again,
sorry - forgot my post... see

osdmap e421: 9 osds: 9 up, 9 in

shows that all your 9 osds are up!

Do you have trouble with your journal/filesystem?

Udo

Am 25.09.2014 08:01, schrieb Udo Lembke:
> Hi,
> looks like some OSDs are down?!
>
> What is the output of "ceph osd tree"
>
> Udo
>
> Am 25.09.2014 04:29, schrieb Aegeaner:
>> The cluster healthy state is WARN:
>>
>>  health HEALTH_WARN 118 pgs degraded; 8 pgs down; 59 pgs incomplete;
>> 28 pgs peering; 292 pgs stale; 87 pgs stuck inactive; 292 pgs stuck stale;
>> 205 pgs stuck unclean; 22 requests are blocked > 32 sec;
>> recovery 12474/46357 objects degraded (26.909%)
>>  monmap e3: 3 mons at
>> {CVM-0-mon01=172.18.117.146:6789/0,CVM-0-mon02=172.18.117.152:6789/0,CVM-0-mon03=172.18.117.153:6789/0},
>> election epoch 24, quorum 0,1,2 CVM-0-mon01,CVM-0-mon02,CVM-0-mon03
>>  osdmap e421: 9 osds: 9 up, 9 in
>>   pgmap v2261: 292 pgs, 4 pools, 91532 MB data, 23178 objects
>> 330 MB used, 3363 GB / 3363 GB avail
>> 12474/46357 objects degraded (26.909%)
>>   20 stale+peering
>>   87 stale+active+clean
>>    8 stale+down+peering
>>   59 stale+incomplete
>>  118 stale+active+degraded
>>
>>
>> What do these errors mean? Can these PGs be recovered?
>>
>>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com 
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>

___
ceph-users mailing list
ceph-users@lists.ceph.com 
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




--
Best regards, Irek Fasikhov
Mob.: +79229045757


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Ceph-community] Pgs are in stale+down+peering state

2014-09-25 Thread Sahana Lokeshappa
Replies Inline :

Sahana Lokeshappa
Test Development Engineer I
SanDisk Corporation
3rd Floor, Bagmane Laurel, Bagmane Tech Park
C V Raman nagar, Bangalore 560093
T: +918042422283
sahana.lokesha...@sandisk.com

-Original Message-
From: Sage Weil [mailto:sw...@redhat.com]
Sent: Wednesday, September 24, 2014 6:10 PM
To: Sahana Lokeshappa
Cc: Varada Kari; ceph-us...@ceph.com
Subject: RE: [Ceph-community] Pgs are in stale+down+peering state

On Wed, 24 Sep 2014, Sahana Lokeshappa wrote:
> 2.a9  518  0  0  0  0  2172649472  3001  3001  active+clean  2014-09-22 17:49:35.357586
> 6826'35762  17842:72706  [12,7,28]  12  [12,7,28]  12  6826'35762
> 2014-09-22 11:33:55.985449  0'0  2014-09-16 20:11:32.693864

Can you verify that 2.a9 exists in the data directory for 12, 7, and/or 28?  If
so, the next step would be to enable logging (debug osd = 20, debug ms = 1) and
see why peering is stuck...

Yes, 2.a9 directories are present in osd.12, 7, and 28,

and the 0.49, 0.4d, and 0.1c directories are not present in their respective acting OSDs.
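
(One way to raise these debug levels, e.g. at runtime via injectargs, is roughly the following; shown here only as a sketch:)

  # raise OSD and messenger debugging on the acting set of 2.a9
  ceph tell osd.12 injectargs '--debug-osd 20 --debug-ms 1'
  ceph tell osd.7  injectargs '--debug-osd 20 --debug-ms 1'
  ceph tell osd.28 injectargs '--debug-osd 20 --debug-ms 1'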


Here are the logs I can see when debugs were raised to 20


2014-09-24 18:38:41.706566 7f92e2dc8700  7 osd.12 pg_epoch: 17850 pg[2.738( v 
6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 
17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 
luod=0'0 crt=0'0 lcod 0'0 active] replica_scrub
2014-09-24 18:38:41.706586 7f92e2dc8700 10 osd.12 pg_epoch: 17850 pg[2.738( v 
6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 
17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 
luod=0'0 crt=0'0 lcod 0'0 active] build_scrub_map
2014-09-24 18:38:41.706592 7f92e2dc8700 20 osd.12 pg_epoch: 17850 pg[2.738( v 
6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 
17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 
luod=0'0 crt=0'0 lcod 0'0 active] scrub_map_chunk [476de738//0//-1,f38//0//-1)
2014-09-24 18:38:41.711778 7f92e2dc8700 10 osd.12 pg_epoch: 17850 pg[2.738( v 
6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 
17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 
luod=0'0 crt=0'0 lcod 0'0 active] _scan_list scanning 23 objects deeply
2014-09-24 18:38:41.730881 7f92ed5dd700 20 osd.12 17850 share_map_peer 
0x89cda20 already has epoch 17850
2014-09-24 18:38:41.73 7f92eede0700 20 osd.12 17850 share_map_peer 
0x89cda20 already has epoch 17850
2014-09-24 18:38:41.822444 7f92ed5dd700 20 osd.12 17850 share_map_peer 
0xd2eb080 already has epoch 17850
2014-09-24 18:38:41.822519 7f92eede0700 20 osd.12 17850 share_map_peer 
0xd2eb080 already has epoch 17850
2014-09-24 18:38:41.878894 7f92eede0700 20 osd.12 17850 share_map_peer 
0xd5cd5a0 already has epoch 17850
2014-09-24 18:38:41.878921 7f92ed5dd700 20 osd.12 17850 share_map_peer 
0xd5cd5a0 already has epoch 17850
2014-09-24 18:38:41.918307 7f92ed5dd700 20 osd.12 17850 share_map_peer 
0x1161bde0 already has epoch 17850
2014-09-24 18:38:41.918426 7f92eede0700 20 osd.12 17850 share_map_peer 
0x1161bde0 already has epoch 17850
2014-09-24 18:38:41.951678 7f92ed5dd700 20 osd.12 17850 share_map_peer 
0x7fc5700 already has epoch 17850
2014-09-24 18:38:41.951709 7f92eede0700 20 osd.12 17850 share_map_peer 
0x7fc5700 already has epoch 17850
2014-09-24 18:38:42.064759 7f92e2dc8700 10 osd.12 pg_epoch: 17850 pg[2.738( v 
6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 
17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 
luod=0'0 crt=0'0 lcod 0'0 active] build_scrub_map_chunk done.
2014-09-24 18:38:42.107016 7f92ed5dd700 20 osd.12 17850 share_map_peer 
0x10377b80 already has epoch 17850
2014-09-24 18:38:42.107032 7f92eede0700 20 osd.12 17850 share_map_peer 
0x10377b80 already has epoch 17850
2014-09-24 18:38:42.109356 7f92f15e5700 10 osd.12 17850 do_waiters -- start
2014-09-24 18:38:42.109372 7f92f15e5700 10 osd.12 17850 do_waiters -- finish
2014-09-24 18:38:42.109373 7f92f15e5700 20 osd.12 17850 _dispatch 0xeb0d900 
replica scrub(pg: 
2.738,from:0'0,to:6489'28646,epoch:17850,start:f38//0//-1,end:92371f38//0//-1,chunky:1,deep:1,version:5)
 v5
2014-09-24 18:38:42.109378 7f92f15e5700 10 osd.12 17850 queueing MOSDRepScrub 
replica scrub(pg: 
2.738,from:0'0,to:6489'28646,epoch:17850,start:f38//0//-1,end:92371f38//0//-1,chunky:1,deep:1,version:5)
 v5
2014-09-24 18:38:42.109395 7f92f15e5700 10 osd.12 17850 do_waiters -- start
2014-09-24 18:38:42.109396 7f92f15e5700 10 osd.12 17850 do_waiters -- finish
2014-09-24 18:38:42.109456 7f92e2dc8700  7 osd.12 pg_epoch: 17850 pg[2.738( v 
6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 
17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 
luod=0'0 crt=0'0 lcod 0'0 active] replica_scrub
2014-09-24 18:38:42.109522 7f92e2dc8700 10 osd.12 pg_epoch: 17850 pg[2.738( v 
6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les

Re: [ceph-users] [Ceph-community] Pgs are in stale+down+peering state

2014-09-25 Thread Sahana Lokeshappa
Hi Craig,

Sorry for late response. Somehow missed this mail.
All OSDs are up and running. There were no specific logs related to this
activity, and there are no IOs running right now. A few OSDs were marked in and
out, fully removed, and recreated before these PGs reached this state.
I have tried restarting the OSDs; it didn't work.

Thanks
Sahana Lokeshappa
Test Development Engineer I
SanDisk Corporation
3rd Floor, Bagmane Laurel, Bagmane Tech Park
C V Raman nagar, Bangalore 560093
T: +918042422283
sahana.lokesha...@sandisk.com

From: Craig Lewis [mailto:cle...@centraldesktop.com]
Sent: Wednesday, September 24, 2014 5:44 AM
To: Sahana Lokeshappa
Cc: ceph-us...@ceph.com
Subject: Re: [ceph-users] [Ceph-community] Pgs are in stale+down+peering state

Is osd.12  doing anything strange?  Is it consuming lots of CPU or IO?  Is it 
flapping?   Writing any interesting logs?  Have you tried restarting it?

If that doesn't help, try the other involved osds: 56, 27, 6, 25, 23.  I doubt 
that it will help, but it won't hurt.



On Mon, Sep 22, 2014 at 11:21 AM, Varada Kari <varada.k...@sandisk.com> wrote:
Hi Sage,

To give more context on this problem,

This cluster has two pools: rbd and a user-created one.

osd.12 is the primary for some other PGs as well, but the problem happens for
these three PGs.

$ sudo ceph osd lspools
0 rbd,2 pool1,

$ sudo ceph -s
cluster 99ffc4a5-2811-4547-bd65-34c7d4c58758
 health HEALTH_WARN 3 pgs down; 3 pgs peering; 3 pgs stale; 3 pgs stuck 
inactive; 3 pgs stuck stale; 3 pgs stuck unclean; 1 requests are blocked > 32 
sec
monmap e1: 3 mons at 
{rack2-ram-1=10.242.42.180:6789/0,rack2-ram-2=10.242.42.184:6789/0,rack2-ram-3=10.242.42.188:6789/0},
 election epoch 2008, quorum 0,1,2 rack2-ram-1,rack2-ram-2,rack2-ram-3
 osdmap e17842: 64 osds: 64 up, 64 in
  pgmap v79729: 2148 pgs, 2 pools, 4135 GB data, 1033 kobjects
12504 GB used, 10971 GB / 23476 GB avail
2145 active+clean
   3 stale+down+peering

Snippet from pg dump:

2.a9   518  0  0  0  0  2172649472  3001  3001  active+clean  2014-09-22 17:49:35.357586  6826'35762  17842:72706  [12,7,28]  12  [12,7,28]  12  6826'35762  2014-09-22 11:33:55.985449  0'0  2014-09-16 20:11:32.693864
0.59   0  0  0  0  0  0  0  0  active+clean  2014-09-22 17:50:00.751218  0'0  17842:4472  [12,41,2]  12  [12,41,2]  12  0'0  2014-09-22 16:47:09.315499  0'0  2014-09-16 12:20:48.618726
0.4d   0  0  0  0  0  0  4  4  stale+down+peering  2014-09-18 17:51:10.038247  186'4  11134:498  [12,56,27]  12  [12,56,27]  12  186'4  2014-09-18 17:30:32.393188  0'0  2014-09-16 12:20:48.615322
0.49   0  0  0  0  0  0  0  0  stale+down+peering  2014-09-18 17:44:52.681513  0'0  11134:498  [12,6,25]  12  [12,6,25]  12  0'0  2014-09-18 17:16:12.986658  0'0  2014-09-16 12:20:48.614192
0.1c   0  0  0  0  0  0  12  12  stale+down+peering  2014-09-18 17:51:16.735549  186'12  11134:522  [12,25,23]  12  [12,25,23]  12  186'12  2014-09-18 17:16:04.457863  186'10  2014-09-16 14:23:58.731465
2.17   510  0  0  0  0  0  2139095040  3001  3001  active+clean  2014-09-22 17:52:20.364754  6784'30742  17842:72033  [12,27,23]  12  [12,27,23]  12  6784'30742  2014-09-22 00:19:39.905291  0'0  2014-09-16 20:11:17.016299
2.7e8  508  0  0  0  0  2130706432  3433  3433  active+clean  2014-09-22 17:52:20.365083  6702'21132  17842:64769  [12,25,23]  12  [12,25,23]  12  6702'21132  2014-09-22 17:01:20.546126  0'0  2014-09-16 14:42:32.079187
2.6a5  528  0  0  0  0  2214592512  2840  2840  active+clean  2014-09-22 22:50:38.092084  6775'34416  17842:83221  [12,58,0]  12  [12,58,0]  12  6775'34416  2014-09-22 22:50:38.091989  0'0  2014-09-16 20:11:32.703368

And we couldn't observe any peering events happening on the primary OSD.

$ sudo ceph pg 0.49 query
Error ENOENT: i don't have pgid 0.49
$ sudo ceph pg 0.4d query
Error ENOENT: i don't have pgid 0.4d
$ sudo ceph pg 0.1c query
Error ENOENT: i don't have pgid 0.1c

Not able to explain why the peering was stuck. BTW, Rbd pool doesn’t contain 
any data.

Varada

From: Ceph-community 
[mailto:ceph-community-boun...@lists.ceph.com]
 On Behalf Of Sage Weil
Sent: Monday, September 22, 2014 10:44 PM
To: Sahana Lokeshappa; 
ceph-users@lists.ceph.com

Re: [ceph-users] [Ceph-community] Pgs are in stale+down+peering state

2014-09-25 Thread Sahana Lokeshappa
Hi All,

Here are the steps I followed to get all PGs back to the active+clean state (a rough
command sketch follows the list). I still don't know the root cause of this PG state.

1. Force create pgs which are in stale+down+peering
2. Stop osd.12
3. Mark osd.12 as lost
4. Start osd.12
5. All pgs were back to active+clean state
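
Roughly, those steps map to commands like the following; the PG ids and service
commands are illustrative and depend on the init system in use, so treat this as
a sketch rather than a transcript:

  # 1. force-create the PGs stuck in stale+down+peering
  ceph pg force_create_pg 0.49
  ceph pg force_create_pg 0.4d
  ceph pg force_create_pg 0.1c
  # 2-4. stop the common primary, mark it lost, start it again
  stop ceph-osd id=12          # or: service ceph stop osd.12
  ceph osd lost 12 --yes-i-really-mean-it
  start ceph-osd id=12         # or: service ceph start osd.12
  # 5. watch the PGs go back to active+clean
  ceph -w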

Thanks
Sahana Lokeshappa
Test Development Engineer I
SanDisk Corporation
3rd Floor, Bagmane Laurel, Bagmane Tech Park
C V Raman nagar, Bangalore 560093
T: +918042422283 
sahana.lokesha...@sandisk.com


-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Sahana 
Lokeshappa
Sent: Thursday, September 25, 2014 1:26 PM
To: Sage Weil
Cc: ceph-us...@ceph.com
Subject: Re: [ceph-users] [Ceph-community] Pgs are in stale+down+peering state

Replies Inline :

Sahana Lokeshappa
Test Development Engineer I
SanDisk Corporation
3rd Floor, Bagmane Laurel, Bagmane Tech Park C V Raman nagar, Bangalore 560093
T: +918042422283
sahana.lokesha...@sandisk.com

-Original Message-
From: Sage Weil [mailto:sw...@redhat.com]
Sent: Wednesday, September 24, 2014 6:10 PM
To: Sahana Lokeshappa
Cc: Varada Kari; ceph-us...@ceph.com
Subject: RE: [Ceph-community] Pgs are in stale+down+peering state

On Wed, 24 Sep 2014, Sahana Lokeshappa wrote:
> 2.a9  518  0  0  0  0  2172649472  3001  3001  active+clean  2014-09-22 17:49:35.357586
> 6826'35762  17842:72706  [12,7,28]  12  [12,7,28]  12  6826'35762
> 2014-09-22 11:33:55.985449  0'0  2014-09-16 20:11:32.693864

Can you verify that 2.a9 exists in the data directory for 12, 7, and/or 28?  If
so, the next step would be to enable logging (debug osd = 20, debug ms = 1) and
see why peering is stuck...

Yes, 2.a9 directories are present in osd.12, 7, and 28,

and the 0.49, 0.4d, and 0.1c directories are not present in their respective acting OSDs.


Here are the logs I can see when debugs were raised to 20


2014-09-24 18:38:41.706566 7f92e2dc8700  7 osd.12 pg_epoch: 17850 pg[2.738( v 
6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 
17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 
luod=0'0 crt=0'0 lcod 0'0 active] replica_scrub
2014-09-24 18:38:41.706586 7f92e2dc8700 10 osd.12 pg_epoch: 17850 pg[2.738( v 
6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 
17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 
luod=0'0 crt=0'0 lcod 0'0 active] build_scrub_map
2014-09-24 18:38:41.706592 7f92e2dc8700 20 osd.12 pg_epoch: 17850 pg[2.738( v 
6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 
17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 
luod=0'0 crt=0'0 lcod 0'0 active] scrub_map_chunk [476de738//0//-1,f38//0//-1)
2014-09-24 18:38:41.711778 7f92e2dc8700 10 osd.12 pg_epoch: 17850 pg[2.738( v 
6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 
17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 
luod=0'0 crt=0'0 lcod 0'0 active] _scan_list scanning 23 objects deeply
2014-09-24 18:38:41.730881 7f92ed5dd700 20 osd.12 17850 share_map_peer 
0x89cda20 already has epoch 17850
2014-09-24 18:38:41.73 7f92eede0700 20 osd.12 17850 share_map_peer 
0x89cda20 already has epoch 17850
2014-09-24 18:38:41.822444 7f92ed5dd700 20 osd.12 17850 share_map_peer 
0xd2eb080 already has epoch 17850
2014-09-24 18:38:41.822519 7f92eede0700 20 osd.12 17850 share_map_peer 
0xd2eb080 already has epoch 17850
2014-09-24 18:38:41.878894 7f92eede0700 20 osd.12 17850 share_map_peer 
0xd5cd5a0 already has epoch 17850
2014-09-24 18:38:41.878921 7f92ed5dd700 20 osd.12 17850 share_map_peer 
0xd5cd5a0 already has epoch 17850
2014-09-24 18:38:41.918307 7f92ed5dd700 20 osd.12 17850 share_map_peer 
0x1161bde0 already has epoch 17850
2014-09-24 18:38:41.918426 7f92eede0700 20 osd.12 17850 share_map_peer 
0x1161bde0 already has epoch 17850
2014-09-24 18:38:41.951678 7f92ed5dd700 20 osd.12 17850 share_map_peer 
0x7fc5700 already has epoch 17850
2014-09-24 18:38:41.951709 7f92eede0700 20 osd.12 17850 share_map_peer 
0x7fc5700 already has epoch 17850
2014-09-24 18:38:42.064759 7f92e2dc8700 10 osd.12 pg_epoch: 17850 pg[2.738( v 
6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 
17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 
luod=0'0 crt=0'0 lcod 0'0 active] build_scrub_map_chunk done.
2014-09-24 18:38:42.107016 7f92ed5dd700 20 osd.12 17850 share_map_peer 
0x10377b80 already has epoch 17850
2014-09-24 18:38:42.107032 7f92eede0700 20 osd.12 17850 share_map_peer 
0x10377b80 already has epoch 17850
2014-09-24 18:38:42.109356 7f92f15e5700 10 osd.12 17850 do_waiters -- start
2014-09-24 18:38:42.109372 7f92f15e5700 10 osd.12 17850 do_waiters -- finish
2014-09-24 18:38:42.109373 7f92f15e5700 20 osd.12 17850 _dispatch 0xeb0d900 
replica scrub(pg: 
2.738,from:0'0,to:6489'28646,epoch:17850,start:f38//0//-1,end:92371f38//0//-1,chunky:1,deep:1,version:5)
 v5
20

Re: [ceph-users] Frequent Crashes on rbd to nfs gateway Server

2014-09-25 Thread Micha Krause

Hi,

> > That's strange.  3.13 is way before any changes that could have had any
> > such effect.  Can you by any chance try with older kernels to see where
> > it starts misbehaving for you?  3.12?  3.10?  3.8?


my crush tunables are set to bobtail, so I can't go below 3.9. I will try 3.12
tomorrow and report back.


Ok, I have tested 3.12.9 and it also hangs.

I have no other pre-built kernels to test :-(.

If I have to compile kernels anyway, I will test 3.16.3 as well :-/.


Micha Krause
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-09-25 Thread Alexandre DERUMIER
>>As Dieter asked, what replication level is this, I guess 1?

Yes, replication x1 for these benchmarks.

>>Now at 3 nodes and 6 OSDs you're getting about the performance of a single
>>SSD, food for thought.

Yes, sure. I don't have more nodes to test, but I would like to know whether it
scales beyond 20K IOPS with more nodes.

But clearly, the CPU is the limit.
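
(For reference, a typical fio job for this kind of 4K random test straight against
an RBD image is sketched below; the pool, image, and client names are placeholders
rather than the exact job used for the numbers above:)

  # 4K random write against an existing RBD image via fio's rbd engine
  fio --ioengine=rbd --clientname=admin --pool=rbd --rbdname=fio-test \
      --rw=randwrite --bs=4k --iodepth=32 --direct=1 \
      --runtime=60 --time_based --group_reporting --name=rbd-4k-randwrite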



- Original Message -

From: "Christian Balzer"
To: ceph-users@lists.ceph.com
Sent: Thursday, 25 September 2014 06:50:31
Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K
IOPS

On Wed, 24 Sep 2014 20:49:21 +0200 (CEST) Alexandre DERUMIER wrote: 

> >>What about writes with Giant? 
> 
> I'm around 
> - 4k iops (4k random) with 1osd (1 node - 1 osd) 
> - 8k iops (4k random) with 2 osd (1 node - 2 osd) 
> - 16K iops (4k random) with 4 osd (2 nodes - 2 osd by node) 
> - 22K iops (4k random) with 6 osd (3 nodes - 2 osd by node) 
> 
> Seem to scale, but I'm cpu bound on node (8 cores E5-2603 v2 @ 1.80GHz 
> 100% cpu for 2 osd) 
> 
You don't even need a full SSD cluster to see that Ceph has a lot of room 
for improvements, see my "Slow IOPS on RBD compared to journal and backing 
devices" thread in May. 

As Dieter asked, what replication level is this, I guess 1? 

Now at 3 nodes and 6 OSDs you're getting about the performance of a single 
SSD, food for thought. 

Christian 

> - Original Message -
> 
> From: "Sebastien Han"
> To: "Jian Zhang"
> Cc: "Alexandre DERUMIER", ceph-users@lists.ceph.com
> Sent: Tuesday, 23 September 2014 17:41:38
> Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3,
> 2K IOPS
> 
> What about writes with Giant? 
> 
> On 18 Sep 2014, at 08:12, Zhang, Jian  wrote: 
> 
> > Has anyone ever tested multi-volume performance on a *FULL* SSD
> > setup? We are able to get ~18K IOPS for 4K random read on a single
> > volume with fio (with the rbd engine) on a 12x DC3700 setup, but are only
> > able to get ~23K (peak) IOPS even with multiple volumes. It seems the maximum
> > random write performance we can get on the entire cluster is quite
> > close to single-volume performance.
> > 
> > Thanks 
> > Jian 
> > 
> > 
> > -Original Message- 
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf 
> > Of Sebastien Han Sent: Tuesday, September 16, 2014 9:33 PM 
> > To: Alexandre DERUMIER 
> > Cc: ceph-users@lists.ceph.com 
> > Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go 
> > over 3, 2K IOPS 
> > 
> > Hi, 
> > 
> > Thanks for keeping us updated on this subject. 
> > dsync is definitely killing the ssd. 
> > 
> > I don't have much to add; I'm just surprised that you're only getting
> > 5299 with 0.85 since I've been able to get 6.4K. Well, I was using the
> > 200GB model, that might explain it.
> > 
> > 
> > On 12 Sep 2014, at 16:32, Alexandre DERUMIER  
> > wrote: 
> > 
> >> here the results for the intel s3500 
> >>  
> >> Max performance is with ceph 0.85 + optracker disabled.
> >> The Intel S3500 doesn't have the d_sync problem the Crucial has.
> >> 
> >> %util shows almost 100% for read and write, so maybe the SSD disk
> >> performance is the limit.
> >> 
> >> I have some STEC ZeusRAM 8GB in stock (I used them for ZFS ZIL); I'll
> >> try to bench them next week.
> >> 
> >> 
> >> 
> >> 
> >> 
> >> 
> >> INTEL s3500 
> >> --- 
> >> raw disk 
> >>  
> >> 
> >> randread: fio --filename=/dev/sdb --direct=1 --rw=randread --bs=4k 
> >> --iodepth=32 --group_reporting --invalidate=0 --name=abc 
> >> --ioengine=aio bw=288207KB/s, iops=72051 
> >> 
> >> Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await 
> >> r_await w_await svctm %util sdb 0,00 0,00 73454,00 0,00 293816,00 
> >> 0,00 8,00 30,96 0,42 0,42 0,00 0,01 99,90 
> >> 
> >> randwrite: fio --filename=/dev/sdb --direct=1 --rw=randwrite --bs=4k 
> >> --iodepth=32 --group_reporting --invalidate=0 --name=abc 
> >> --ioengine=aio --sync=1
> >> bw=48131KB/s, iops=12032
> >> Device:  rrqm/s  wrqm/s  r/s   w/s       rkB/s  wkB/s     avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
> >> sdb      0,00    0,00    0,00  24120,00  0,00   48240,00  4,00      2,08      0,09   0,00     0,09     0,04   100,00
> >> 
> >> 
> >> ceph 0.80 
> >> - 
> >> randread: no tuning: bw=24578KB/s, iops=6144 
> >> 
> >> 
> >> randwrite: bw=10358KB/s, iops=2589 
> >> Device:  rrqm/s  wrqm/s  r/s   w/s      rkB/s  wkB/s     avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
> >> sdb      0,00    373,00  0,00  8878,00  0,00   34012,50  7,66      1,63      0,18   0,00     0,18     0,06   50,90
> >> 
> >> 
> >> ceph 0.85 : 
> >> - 
> >> 
> >> randread : bw=41406KB/s, iops=10351 
> >> Device:  rrqm/s  wrqm/s  r/s       w/s   rkB/s     wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
> >> sdb      2,00    0,00    10425,00  0,00  41816,00  0,00   8,02      1,36      0,13   0,13     0,00     0,07   75,90
> >> 
> >> randwrite : bw=17204KB/s, iops=4301 
> >> 
> >> Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await 
> >> r_await w_await svctm %util sd

Re: [ceph-users] bug: ceph-deploy does not support jumbo frame

2014-09-25 Thread yuelongguang
Thanks. I have not configured the switch.

I just learned about that.

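I will check whether jumbo frames actually make it end to end with something
like the following (the interface name and target host are placeholders):

  # set the MTU on the node (the switch ports must allow jumbo frames too)
  ip link set eth0 mtu 9000
  # send a full-size frame with the don't-fragment bit set: 8972 = 9000 - 20 (IP) - 8 (ICMP)
  ping -M do -s 8972 <other-ceph-node>

If that ping fails while a normal ping works, the path is dropping jumbo frames,
which would explain ceph-deploy hanging at 'detecting platform for host'.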








On 2014-09-25 12:38:48, "Irek Fasikhov" wrote:

Have you configured the switch?


2014-09-25 5:07 GMT+04:00 yuelongguang :

Hi all,
After I set mtu=9000, ceph-deploy just keeps waiting for a reply at 'detecting
platform for host'.

How can I find out which commands ceph-deploy needs that OSD host to run?

Thanks



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com







--

Best regards, Irek Fasikhov
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Frequent Crashes on rbd to nfs gateway Server

2014-09-25 Thread Andrei Mikhailovsky
Guys, 

I have done some testing with 3.16.3-031603-generic downloaded from the Ubuntu utopic
branch. The hang task problem is gone when using a large block size (tested with
1M and 4M), and I could no longer reproduce the hang tasks while doing 100 dd
tests in a for loop.

However, I can confirm that I am still getting hang tasks while working with a
4K block size. The hang tasks start after about an hour, but they do not cause
the server to crash. After a while the dd test times out and continues with the
loop. This is what I was running:

for i in {1..100} ; do time dd if=/dev/zero of=/tmp/mount/1G bs=4K count=25K 
oflag=direct ; done 

The following test definitely produces hang tasks like these:

[23160.549785] INFO: task dd:2033 blocked for more than 120 seconds. 
[23160.588364] Tainted: G OE 3.16.3-031603-generic #201409171435 
[23160.627998] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this 
message. 
[23160.706856] dd D 000b 0 2033 23859 0x 
[23160.706861] 88011cec78c8 0082 88011cec78d8 
88011cec7fd8 
[23160.706865] 000143c0 000143c0 88048661bcc0 
880113441440 
[23160.706868] 88011cec7898 88067fd54cc0 880113441440 
880113441440 
[23160.706871] Call Trace: 
[23160.706883] [] schedule+0x29/0x70 
[23160.706887] [] io_schedule+0x8f/0xd0 
[23160.706893] [] dio_await_completion+0x54/0xd0 
[23160.706897] [] do_blockdev_direct_IO+0x958/0xcc0 
[23160.706903] [] ? wake_up_bit+0x2e/0x40 
[23160.706908] [] ? jbd2_journal_dirty_metadata+0xc5/0x260 
[23160.706914] [] ? ext4_get_block_write+0x20/0x20 
[23160.706919] [] __blockdev_direct_IO+0x4c/0x50 
[23160.706922] [] ? ext4_get_block_write+0x20/0x20 
[23160.706928] [] ext4_ind_direct_IO+0xce/0x410 
[23160.706931] [] ? ext4_get_block_write+0x20/0x20 
[23160.706935] [] ext4_ext_direct_IO+0x1bb/0x2a0 
[23160.706938] [] ? __ext4_journal_stop+0x78/0xa0 
[23160.706942] [] ext4_direct_IO+0xec/0x1e0 
[23160.706946] [] ? __mark_inode_dirty+0x53/0x2d0 
[23160.706952] [] generic_file_direct_write+0xbb/0x180 
[23160.706957] [] ? mnt_clone_write+0x12/0x30 
[23160.706960] [] __generic_file_write_iter+0x2a7/0x350 
[23160.706963] [] ext4_file_write_iter+0x111/0x3d0 
[23160.706969] [] ? iov_iter_init+0x14/0x40 
[23160.706976] [] new_sync_write+0x7b/0xb0 
[23160.706978] [] vfs_write+0xc7/0x1f0 
[23160.706980] [] SyS_write+0x4f/0xb0 
[23160.706985] [] system_call_fastpath+0x1a/0x1f 
[23280.705400] INFO: task dd:2033 blocked for more than 120 seconds. 
[23280.745358] Tainted: G OE 3.16.3-031603-generic #201409171435 
[23280.785069] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this 
message. 
[23280.864158] dd D 000b 0 2033 23859 0x 
[23280.864164] 88011cec78c8 0082 88011cec78d8 
88011cec7fd8 
[23280.864167] 000143c0 000143c0 88048661bcc0 
880113441440 
[23280.864170] 88011cec7898 88067fd54cc0 880113441440 
880113441440 
[23280.864173] Call Trace: 
[23280.864185] [] schedule+0x29/0x70 
[23280.864197] [] io_schedule+0x8f/0xd0 
[23280.864203] [] dio_await_completion+0x54/0xd0 
[23280.864207] [] do_blockdev_direct_IO+0x958/0xcc0 
[23280.864213] [] ? wake_up_bit+0x2e/0x40 
[23280.864218] [] ? jbd2_journal_dirty_metadata+0xc5/0x260 
[23280.864224] [] ? ext4_get_block_write+0x20/0x20 
[23280.864229] [] __blockdev_direct_IO+0x4c/0x50 
[23280.864239] [] ? ext4_get_block_write+0x20/0x20 
[23280.864244] [] ext4_ind_direct_IO+0xce/0x410 
[23280.864247] [] ? ext4_get_block_write+0x20/0x20 
[23280.864251] [] ext4_ext_direct_IO+0x1bb/0x2a0 
[23280.864254] [] ? __ext4_journal_stop+0x78/0xa0 
[23280.864258] [] ext4_direct_IO+0xec/0x1e0 
[23280.864263] [] ? __mark_inode_dirty+0x53/0x2d0 
[23280.864268] [] generic_file_direct_write+0xbb/0x180 
[23280.864273] [] ? mnt_clone_write+0x12/0x30 
[23280.864284] [] __generic_file_write_iter+0x2a7/0x350 
[23280.864289] [] ext4_file_write_iter+0x111/0x3d0 
[23280.864295] [] ? iov_iter_init+0x14/0x40 
[23280.864300] [] new_sync_write+0x7b/0xb0 
[23280.864302] [] vfs_write+0xc7/0x1f0 
[23280.864307] [] SyS_write+0x4f/0xb0 
[23280.864314] [] system_call_fastpath+0x1a/0x1f 
[23400.861043] INFO: task dd:2033 blocked for more than 120 seconds. 
[23400.901529] Tainted: G OE 3.16.3-031603-generic #201409171435 
[23400.942255] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this 
message. 
[23401.020985] dd D 000b 0 2033 23859 0x 
[23401.020991] 88011cec78c8 0082 88011cec78d8 
88011cec7fd8 
[23401.020995] 000143c0 000143c0 88048661bcc0 
880113441440 
[23401.020997] 88011cec7898 88067fd54cc0 880113441440 
880113441440 
[23401.021001] Call Trace: 
[23401.021014] [] schedule+0x29/0x70 
[23401.021025] [] io_schedule+0x8f/0xd0 
[23401.021031] [] dio_await_completion+0x54/0xd0 
[23401.021035] [] do_blockdev_direct_IO+0x958/0xcc0 
[23401.021041] [] ? wake_up_bit+0x2e/0x40 
[23401.0210

[ceph-users] pgs stuck in active+clean+replay state

2014-09-25 Thread Pavel V. Kaygorodov
Hi!

16 PGs in our Ceph cluster have been in the active+clean+replay state for more than one day.
All clients are working fine.
Is this ok?

root@bastet-mon1:/# ceph -w
cluster fffeafa2-a664-48a7-979a-517e3ffa0da1
 health HEALTH_OK
 monmap e3: 3 mons at 
{1=10.92.8.80:6789/0,2=10.92.8.81:6789/0,3=10.92.8.82:6789/0}, election epoch 
2570, quorum 0,1,2 1,2,3
 osdmap e3108: 16 osds: 16 up, 16 in
  pgmap v1419232: 8704 pgs, 6 pools, 513 GB data, 125 kobjects
2066 GB used, 10879 GB / 12945 GB avail
8688 active+clean
  16 active+clean+replay
  client io 3237 kB/s wr, 68 op/s


root@bastet-mon1:/# ceph pg dump | grep replay
dumped all in format plain
0.fd0   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:29.902766  0'0 3108:2628   
[0,7,14,8] [0,7,14,8]   0   0'0 2014-09-23 02:23:49.463704  
0'0 2014-09-23 02:23:49.463704
0.e80   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:21.945082  0'0 3108:1823   
[2,7,9,10] [2,7,9,10]   2   0'0 2014-09-22 14:37:32.910787  
0'0 2014-09-22 14:37:32.910787
0.aa0   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:29.326607  0'0 3108:2451   
[0,7,15,12][0,7,15,12]  0   0'0 2014-09-23 00:39:10.717363  
0'0 2014-09-23 00:39:10.717363
0.9c0   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:29.325229  0'0 3108:1917   
[0,7,9,12] [0,7,9,12]   0   0'0 2014-09-22 14:40:06.694479  
0'0 2014-09-22 14:40:06.694479
0.9a0   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:29.325074  0'0 3108:2486   
[0,7,14,11][0,7,14,11]  0   0'0 2014-09-23 01:14:55.825900  
0'0 2014-09-23 01:14:55.825900
0.910   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:28.839148  0'0 3108:1962   
[0,7,9,10] [0,7,9,10]   0   0'0 2014-09-22 14:37:44.652796  
0'0 2014-09-22 14:37:44.652796
0.8c0   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:28.838683  0'0 3108:2635   
[0,2,9,11] [0,2,9,11]   0   0'0 2014-09-23 01:52:52.390529  
0'0 2014-09-23 01:52:52.390529
0.8b0   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:21.215964  0'0 3108:1636   
[2,0,8,14] [2,0,8,14]   2   0'0 2014-09-23 01:31:38.134466  
0'0 2014-09-23 01:31:38.134466
0.500   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:35.869160  0'0 3108:1801   
[7,2,15,10][7,2,15,10]  7   0'0 2014-09-20 08:38:53.963779  
0'0 2014-09-13 10:27:26.977929
0.440   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:35.871409  0'0 3108:1819   
[7,2,15,10][7,2,15,10]  7   0'0 2014-09-20 11:59:05.208164  
0'0 2014-09-20 11:59:05.208164
0.390   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:28.653190  0'0 3108:1827   
[0,2,9,10] [0,2,9,10]   0   0'0 2014-09-22 14:40:50.697850  
0'0 2014-09-22 14:40:50.697850
0.320   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:10.970515  0'0 3108:1719   
[2,0,14,9] [2,0,14,9]   2   0'0 2014-09-20 12:06:23.716480  
0'0 2014-09-20 12:06:23.716480
0.2c0   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:28.647268  0'0 3108:2540   
[0,7,12,8] [0,7,12,8]   0   0'0 2014-09-22 23:44:53.387815  
0'0 2014-09-22 23:44:53.387815
0.1f0   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:28.651059  0'0 3108:2522   
[0,2,14,11][0,2,14,11]  0   0'0 2014-09-22 23:38:16.315755  
0'0 2014-09-22 23:38:16.315755
0.7 0   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:35.848797  0'0 3108:1739   
[7,0,12,10][7,0,12,10]  7   0'0 2014-09-22 14:43:38.224718  
0'0 2014-09-22 14:43:38.224718
0.3 0   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:08.885066  0'0 3108:1640   
[2,0,11,15][2,0,11,15]  2   0'0 2014-09-20 06:18:55.987318  
0'0 2014-09-20 06:18:55.987318

With best regards,
  Pavel.

___
ceph-users mailing list
cep

Re: [ceph-users] Frequent Crashes on rbd to nfs gateway Server

2014-09-25 Thread Andrei Mikhailovsky
Right, I've stopped the tests because it is just getting ridiculous. Without
the rbd cache enabled, dd tests run extremely slowly:

dd if=/dev/zero of=/tmp/mount/1G bs=1M count=1000 oflag=direct 
230+0 records in 
230+0 records out 
241172480 bytes (241 MB) copied, 929.71 s, 259 kB/s 

Any thoughts on why I am getting 250 kB/s instead of the expected 100+ MB/s with a
large block size?

How do I investigate what's causing this poor performance?
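
(One thing I can try as a first step, as a sketch: take the NFS/krbd/ext4 path out
of the picture and benchmark the cluster directly with rados bench; the pool name
below is a placeholder.)

  # 30-second write benchmark straight against the cluster (leaves the test objects behind)
  rados bench -p rbd 30 write --no-cleanup
  # read the same objects back sequentially
  rados bench -p rbd 30 seq

If rados bench looks healthy while dd through the mounted filesystem does not, the
problem is more likely in the krbd/NFS/ext4 layering than in the cluster itself.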

Cheers 

Andrei 

- Original Message -

> From: "Andrei Mikhailovsky" 
> To: "Micha Krause" 
> Cc: ceph-users@lists.ceph.com
> Sent: Thursday, 25 September, 2014 10:58:07 AM
> Subject: Re: [ceph-users] Frequent Crashes on rbd to nfs gateway
> Server

> Guys,

> Have done some testing with 3.16.3-031603-generic downloaded from
> Ubuntu utopic branch. The hang task problem is gone when using large
> block size (tested with 1M and 4M) and I could no longer reproduce
> the hang tasks while doing 100 dd tests in a for loop.

> However, I can confirm that I am still getting hang tasks while
> working with a 4K block size. The hang tasks start after about an
> hour, but they do not cause the server crash. After a while the dd
> test times out and continues with the loop. This is what I was
> running:

> for i in {1..100} ; do time dd if=/dev/zero of=/tmp/mount/1G bs=4K
> count=25K oflag=direct ; done

> The following test definitely produces hang tasks like these:

> [23160.549785] INFO: task dd:2033 blocked for more than 120 seconds.
> [23160.588364] Tainted: G OE 3.16.3-031603-generic #201409171435
> [23160.627998] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [23160.706856] dd D 000b 0 2033 23859 0x
> [23160.706861] 88011cec78c8 0082 88011cec78d8
> 88011cec7fd8
> [23160.706865] 000143c0 000143c0 88048661bcc0
> 880113441440
> [23160.706868] 88011cec7898 88067fd54cc0 880113441440
> 880113441440
> [23160.706871] Call Trace:
> [23160.706883] [] schedule+0x29/0x70
> [23160.706887] [] io_schedule+0x8f/0xd0
> [23160.706893] [] dio_await_completion+0x54/0xd0
> [23160.706897] [] do_blockdev_direct_IO+0x958/0xcc0
> [23160.706903] [] ? wake_up_bit+0x2e/0x40
> [23160.706908] [] ?
> jbd2_journal_dirty_metadata+0xc5/0x260
> [23160.706914] [] ? ext4_get_block_write+0x20/0x20
> [23160.706919] [] __blockdev_direct_IO+0x4c/0x50
> [23160.706922] [] ? ext4_get_block_write+0x20/0x20
> [23160.706928] [] ext4_ind_direct_IO+0xce/0x410
> [23160.706931] [] ? ext4_get_block_write+0x20/0x20
> [23160.706935] [] ext4_ext_direct_IO+0x1bb/0x2a0
> [23160.706938] [] ? __ext4_journal_stop+0x78/0xa0
> [23160.706942] [] ext4_direct_IO+0xec/0x1e0
> [23160.706946] [] ? __mark_inode_dirty+0x53/0x2d0
> [23160.706952] []
> generic_file_direct_write+0xbb/0x180
> [23160.706957] [] ? mnt_clone_write+0x12/0x30
> [23160.706960] []
> __generic_file_write_iter+0x2a7/0x350
> [23160.706963] [] ext4_file_write_iter+0x111/0x3d0
> [23160.706969] [] ? iov_iter_init+0x14/0x40
> [23160.706976] [] new_sync_write+0x7b/0xb0
> [23160.706978] [] vfs_write+0xc7/0x1f0
> [23160.706980] [] SyS_write+0x4f/0xb0
> [23160.706985] [] system_call_fastpath+0x1a/0x1f
> [23280.705400] INFO: task dd:2033 blocked for more than 120 seconds.
> [23280.745358] Tainted: G OE 3.16.3-031603-generic #201409171435
> [23280.785069] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [23280.864158] dd D 000b 0 2033 23859 0x
> [23280.864164] 88011cec78c8 0082 88011cec78d8
> 88011cec7fd8
> [23280.864167] 000143c0 000143c0 88048661bcc0
> 880113441440
> [23280.864170] 88011cec7898 88067fd54cc0 880113441440
> 880113441440
> [23280.864173] Call Trace:
> [23280.864185] [] schedule+0x29/0x70
> [23280.864197] [] io_schedule+0x8f/0xd0
> [23280.864203] [] dio_await_completion+0x54/0xd0
> [23280.864207] [] do_blockdev_direct_IO+0x958/0xcc0
> [23280.864213] [] ? wake_up_bit+0x2e/0x40
> [23280.864218] [] ?
> jbd2_journal_dirty_metadata+0xc5/0x260
> [23280.864224] [] ? ext4_get_block_write+0x20/0x20
> [23280.864229] [] __blockdev_direct_IO+0x4c/0x50
> [23280.864239] [] ? ext4_get_block_write+0x20/0x20
> [23280.864244] [] ext4_ind_direct_IO+0xce/0x410
> [23280.864247] [] ? ext4_get_block_write+0x20/0x20
> [23280.864251] [] ext4_ext_direct_IO+0x1bb/0x2a0
> [23280.864254] [] ? __ext4_journal_stop+0x78/0xa0
> [23280.864258] [] ext4_direct_IO+0xec/0x1e0
> [23280.864263] [] ? __mark_inode_dirty+0x53/0x2d0
> [23280.864268] []
> generic_file_direct_write+0xbb/0x180
> [23280.864273] [] ? mnt_clone_write+0x12/0x30
> [23280.864284] []
> __generic_file_write_iter+0x2a7/0x350
> [23280.864289] [] ext4_file_write_iter+0x111/0x3d0
> [23280.864295] [] ? iov_iter_init+0x14/0x40
> [23280.864300] [] new_sync_write+0x7b/0xb0
> [23280.864302] [] vfs_write+0xc7/0x1f0
> [23280.864307] [] SyS_write+0x4f/0xb0
> [23280.864314] [] system_call_fastpath+0x1a/0x1f
> [23400.861

[ceph-users] ceph debian systemd

2014-09-25 Thread zorg

Hi,
I'm using ceph version 0.80.5

I am trying to get a Ceph cluster working using Debian and systemd.

I have already managed to install a Ceph cluster on Debian with sysvinit
without any problem.


But after installing everything with ceph-deploy, without errors,

after rebooting not all of my OSDs start (they are not mounted),
and what is stranger, at each reboot it is not the same OSDs that
start, and some start 10 minutes later.


I've got this in the log:

Sep 25 12:18:23 addceph3 systemd-udevd[437]: 
'/usr/sbin/ceph-disk-activate /dev/sdh1' [1005] terminated by signal 9 
(Killed)
Sep 25 12:18:23 addceph3 systemd-udevd[476]: timeout: killing 
'/usr/sbin/ceph-disk-activate /dev/sdq1' [1142]
Sep 25 12:18:23 addceph3 systemd-udevd[486]: timeout: killing 
'/usr/sbin/ceph-disk-activate /dev/sdg1' [998]
Sep 25 12:18:23 addceph3 systemd-udevd[486]: 
'/usr/sbin/ceph-disk-activate /dev/sdg1' [998] terminated by signal 9 
(Killed)
Sep 25 12:18:23 addceph3 systemd-udevd[476]: 
'/usr/sbin/ceph-disk-activate /dev/sdq1' [1142] terminated by signal 9 
(Killed)
Sep 25 12:18:23 addceph3 systemd-udevd[458]: timeout: killing 
'/usr/sbin/ceph-disk-activate /dev/sdi1' [1001]
Sep 25 12:18:23 addceph3 systemd-udevd[458]: 
'/usr/sbin/ceph-disk-activate /dev/sdi1' [1001] terminated by signal 9 
(Killed)
Sep 25 12:18:23 addceph3 systemd-udevd[444]: timeout: killing 
'/usr/sbin/ceph-disk-activate /dev/sdj1' [1006]
Sep 25 12:18:23 addceph3 systemd-udevd[460]: timeout: killing 
'/usr/sbin/ceph-disk-activate /dev/sdk1' [1152]
Sep 25 12:18:23 addceph3 systemd-udevd[444]: 
'/usr/sbin/ceph-disk-activate /dev/sdj1' [1006] terminated by signal 9 
(Killed)
Sep 25 12:18:23 addceph3 systemd-udevd[460]: 
'/usr/sbin/ceph-disk-activate /dev/sdk1' [1152] terminated by signal 9 
(Killed)
Sep 25 12:18:23 addceph3 systemd-udevd[469]: timeout: killing 
'/usr/sbin/ceph-disk-activate /dev/sdm1' [1110]
Sep 25 12:18:23 addceph3 systemd-udevd[470]: timeout: killing 
'/usr/sbin/ceph-disk-activate /dev/sdp1' [1189]
Sep 25 12:18:23 addceph3 systemd-udevd[469]: 
'/usr/sbin/ceph-disk-activate /dev/sdm1' [1110] terminated by signal 9 
(Killed)
Sep 25 12:18:23 addceph3 systemd-udevd[470]: 
'/usr/sbin/ceph-disk-activate /dev/sdp1' [1189] terminated by signal 9 
(Killed)
Sep 25 12:18:23 addceph3 systemd-udevd[468]: timeout: killing 
'/usr/sbin/ceph-disk-activate /dev/sdl1' [1177]
Sep 25 12:18:23 addceph3 systemd-udevd[447]: timeout: killing 
'/usr/sbin/ceph-disk-activate /dev/sdo1' [1181]
Sep 25 12:18:23 addceph3 systemd-udevd[468]: 
'/usr/sbin/ceph-disk-activate /dev/sdl1' [1177] terminated by signal 9 
(Killed)
Sep 25 12:18:23 addceph3 systemd-udevd[447]: 
'/usr/sbin/ceph-disk-activate /dev/sdo1' [1181] terminated by signal 9 
(Killed)
Sep 25 12:18:23 addceph3 systemd-udevd[490]: timeout: killing 
'/usr/sbin/ceph-disk-activate /dev/sdr1' [1160]
Sep 25 12:18:23 addceph3 systemd-udevd[490]: 
'/usr/sbin/ceph-disk-activate /dev/sdr1' [1160] terminated by signal 9 
(Killed)
Sep 25 12:18:23 addceph3 systemd-udevd[445]: timeout: killing 
'/usr/sbin/ceph-disk-activate /dev/sdn1' [1202]
Sep 25 12:18:23 addceph3 systemd-udevd[445]: 
'/usr/sbin/ceph-disk-activate /dev/sdn1' [1202] terminated by signal 9 
(Killed)
Sep 25 12:18:23 addceph3 kernel: [   39.813701] XFS (sdo1): Mounting 
Filesystem
Sep 25 12:18:23 addceph3 kernel: [   39.854510] XFS (sdo1): Ending clean 
mount
Sep 25 12:22:59 addceph3 systemd[1]: ceph.service operation timed out. 
Terminating.
Sep 25 12:22:59 addceph3 systemd[1]: Failed to start LSB: Start Ceph 
distributed file system daemons at boot time.




I'm not actually very experienced with systemd,
and I don't really know how Ceph handles systemd.

If someone could give me a bit of information

thanks
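
(Judging by the logs, udev is killing ceph-disk-activate before it finishes; as a
stopgap it should be possible to run the same activation by hand after boot for
each OSD partition that did not come up, e.g.:)

  # show partitions and their ceph roles
  ceph-disk list
  # mount and start an OSD whose udev activation timed out
  ceph-disk activate /dev/sdh1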



--
probeSys - spécialiste GNU/Linux
site web : http://www.probesys.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Frequent Crashes on rbd to nfs gateway Server

2014-09-25 Thread Ilya Dryomov
On Thu, Sep 25, 2014 at 1:58 PM, Andrei Mikhailovsky  wrote:
> Guys,
>
> Have done some testing with 3.16.3-031603-generic downloaded from Ubuntu
> utopic branch. The hang task problem is gone when using large block size
> (tested with 1M and 4M) and I could no longer reproduce the hang tasks
> while doing 100 dd tests in a for loop.
>
>
>
> However, I can confirm that I am still getting hang tasks while working with
> a 4K block size. The hang tasks start after about an hour, but they do not
> cause the server crash. After a while the dd test times out and continues
> with the loop. This is what I was running:
>
> for i in {1..100} ; do time dd if=/dev/zero of=/tmp/mount/1G bs=4K count=25K
> oflag=direct ; done
>
> The following test definitely produces hang tasks like these:
>
> [23160.549785] INFO: task dd:2033 blocked for more than 120 seconds.
> [23160.588364]   Tainted: G   OE 3.16.3-031603-generic
> #201409171435
> [23160.627998] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [23160.706856] dd  D 000b 0  2033  23859
> 0x
> [23160.706861]  88011cec78c8 0082 88011cec78d8
> 88011cec7fd8
> [23160.706865]  000143c0 000143c0 88048661bcc0
> 880113441440
> [23160.706868]  88011cec7898 88067fd54cc0 880113441440
> 880113441440
> [23160.706871] Call Trace:
> [23160.706883]  [] schedule+0x29/0x70
> [23160.706887]  [] io_schedule+0x8f/0xd0
> [23160.706893]  [] dio_await_completion+0x54/0xd0
> [23160.706897]  [] do_blockdev_direct_IO+0x958/0xcc0
> [23160.706903]  [] ? wake_up_bit+0x2e/0x40
> [23160.706908]  [] ?
> jbd2_journal_dirty_metadata+0xc5/0x260
> [23160.706914]  [] ? ext4_get_block_write+0x20/0x20
> [23160.706919]  [] __blockdev_direct_IO+0x4c/0x50
> [23160.706922]  [] ? ext4_get_block_write+0x20/0x20
> [23160.706928]  [] ext4_ind_direct_IO+0xce/0x410
> [23160.706931]  [] ? ext4_get_block_write+0x20/0x20
> [23160.706935]  [] ext4_ext_direct_IO+0x1bb/0x2a0
> [23160.706938]  [] ? __ext4_journal_stop+0x78/0xa0
> [23160.706942]  [] ext4_direct_IO+0xec/0x1e0
> [23160.706946]  [] ? __mark_inode_dirty+0x53/0x2d0
> [23160.706952]  [] generic_file_direct_write+0xbb/0x180
> [23160.706957]  [] ? mnt_clone_write+0x12/0x30
> [23160.706960]  [] __generic_file_write_iter+0x2a7/0x350
> [23160.706963]  [] ext4_file_write_iter+0x111/0x3d0
> [23160.706969]  [] ? iov_iter_init+0x14/0x40
> [23160.706976]  [] new_sync_write+0x7b/0xb0
> [23160.706978]  [] vfs_write+0xc7/0x1f0
> [23160.706980]  [] SyS_write+0x4f/0xb0
> [23160.706985]  [] system_call_fastpath+0x1a/0x1f
> [23280.705400] INFO: task dd:2033 blocked for more than 120 seconds.
> [23280.745358]   Tainted: G   OE 3.16.3-031603-generic
> #201409171435
> [23280.785069] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [23280.864158] dd  D 000b 0  2033  23859
> 0x
> [23280.864164]  88011cec78c8 0082 88011cec78d8
> 88011cec7fd8
> [23280.864167]  000143c0 000143c0 88048661bcc0
> 880113441440
> [23280.864170]  88011cec7898 88067fd54cc0 880113441440
> 880113441440
> [23280.864173] Call Trace:
> [23280.864185]  [] schedule+0x29/0x70
> [23280.864197]  [] io_schedule+0x8f/0xd0
> [23280.864203]  [] dio_await_completion+0x54/0xd0
> [23280.864207]  [] do_blockdev_direct_IO+0x958/0xcc0
> [23280.864213]  [] ? wake_up_bit+0x2e/0x40
> [23280.864218]  [] ?
> jbd2_journal_dirty_metadata+0xc5/0x260
> [23280.864224]  [] ? ext4_get_block_write+0x20/0x20
> [23280.864229]  [] __blockdev_direct_IO+0x4c/0x50
> [23280.864239]  [] ? ext4_get_block_write+0x20/0x20
> [23280.864244]  [] ext4_ind_direct_IO+0xce/0x410
> [23280.864247]  [] ? ext4_get_block_write+0x20/0x20
> [23280.864251]  [] ext4_ext_direct_IO+0x1bb/0x2a0
> [23280.864254]  [] ? __ext4_journal_stop+0x78/0xa0
> [23280.864258]  [] ext4_direct_IO+0xec/0x1e0
> [23280.864263]  [] ? __mark_inode_dirty+0x53/0x2d0
> [23280.864268]  [] generic_file_direct_write+0xbb/0x180
> [23280.864273]  [] ? mnt_clone_write+0x12/0x30
> [23280.864284]  [] __generic_file_write_iter+0x2a7/0x350
> [23280.864289]  [] ext4_file_write_iter+0x111/0x3d0
> [23280.864295]  [] ? iov_iter_init+0x14/0x40
> [23280.864300]  [] new_sync_write+0x7b/0xb0
> [23280.864302]  [] vfs_write+0xc7/0x1f0
> [23280.864307]  [] SyS_write+0x4f/0xb0
> [23280.864314]  [] system_call_fastpath+0x1a/0x1f
> [23400.861043] INFO: task dd:2033 blocked for more than 120 seconds.
> [23400.901529]   Tainted: G   OE 3.16.3-031603-generic
> #201409171435
> [23400.942255] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [23401.020985] dd  D 000b 0  2033  23859
> 0x
> [23401.020991]  88011cec78c8 0082 88011cec78d8
> 88011cec7fd8
> [23401.020995]  000143c0 000143c0 88048661bcc0
> 880113441440
> [23401.

Re: [ceph-users] [ceph-calamari] Setting up Ceph calamari :: Made Simple

2014-09-25 Thread Johan Kooijman
Karan,

Thanks for the tutorial, great stuff. Please note that in order to get the
graphs working, I had to install ipvsadm and create a symlink from
/sbin/ipvsadm to /usr/bin/ipvsadm (CentOS 6).
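
(On CentOS 6 that boils down to something like:)

  # install ipvsadm and put it where the graph collection expects to find it
  yum install -y ipvsadm
  ln -s /sbin/ipvsadm /usr/bin/ipvsadm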

On Wed, Sep 24, 2014 at 10:16 AM, Karan Singh  wrote:

> Hello Cepher’s
>
> Now here comes my new blog on setting up Ceph Calamari.
>
> I hope you would like this step-by-step guide
>
> http://karan-mj.blogspot.fi/2014/09/ceph-calamari-survival-guide.html
>
>
> - Karan -
>
>
> ___
> ceph-calamari mailing list
> ceph-calam...@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-calamari-ceph.com
>
>


-- 
Met vriendelijke groeten / With kind regards,
Johan Kooijman
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Frequent Crashes on rbd to nfs gateway Server

2014-09-25 Thread Andrei Mikhailovsky
Ilya, 

I've not used rbd map on older kernels. I'm just experimenting with rbd map to have
an iSCSI and NFS gateway service for hypervisors such as XenServer and VMware.
I've tried it with the latest Ubuntu LTS kernel, 3.13 I believe, and noticed the
issue.
Can you not reproduce the hang tasks when doing dd testing? Have you tried 4K
block sizes and running it for some time, like I have done?

Thanks 

Andrei 

- Original Message -

> From: "Ilya Dryomov" 
> To: "Andrei Mikhailovsky" 
> Cc: "Micha Krause" , ceph-users@lists.ceph.com
> Sent: Thursday, 25 September, 2014 12:04:37 PM
> Subject: Re: [ceph-users] Frequent Crashes on rbd to nfs gateway
> Server

> On Thu, Sep 25, 2014 at 1:58 PM, Andrei Mikhailovsky
>  wrote:
> > Guys,
> >
> > Have done some testing with 3.16.3-031603-generic downloaded from
> > Ubuntu
> > utopic branch. The hang task problem is gone when using large block
> > size
> > (tested with 1M and 4M) and I could no longer reproduce the hang
> > tasks
> > while doing 100 dd tests in a for loop.
> >
> >
> >
> > However, I can confirm that I am still getting hang tasks while
> > working with
> > a 4K block size. The hang tasks start after about an hour, but they
> > do not
> > cause the server crash. After a while the dd test times out and
> > continues
> > with the loop. This is what I was running:
> >
> > for i in {1..100} ; do time dd if=/dev/zero of=/tmp/mount/1G bs=4K
> > count=25K
> > oflag=direct ; done
> >
> > The following test definitely produces hang tasks like these:
> >
> > [23160.549785] INFO: task dd:2033 blocked for more than 120
> > seconds.
> > [23160.588364] Tainted: G OE 3.16.3-031603-generic
> > #201409171435
> > [23160.627998] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables
> > this message.
> > [23160.706856] dd D 000b 0 2033 23859
> > 0x
> > [23160.706861] 88011cec78c8 0082 88011cec78d8
> > 88011cec7fd8
> > [23160.706865] 000143c0 000143c0 88048661bcc0
> > 880113441440
> > [23160.706868] 88011cec7898 88067fd54cc0 880113441440
> > 880113441440
> > [23160.706871] Call Trace:
> > [23160.706883] [] schedule+0x29/0x70
> > [23160.706887] [] io_schedule+0x8f/0xd0
> > [23160.706893] [] dio_await_completion+0x54/0xd0
> > [23160.706897] []
> > do_blockdev_direct_IO+0x958/0xcc0
> > [23160.706903] [] ? wake_up_bit+0x2e/0x40
> > [23160.706908] [] ?
> > jbd2_journal_dirty_metadata+0xc5/0x260
> > [23160.706914] [] ?
> > ext4_get_block_write+0x20/0x20
> > [23160.706919] [] __blockdev_direct_IO+0x4c/0x50
> > [23160.706922] [] ?
> > ext4_get_block_write+0x20/0x20
> > [23160.706928] [] ext4_ind_direct_IO+0xce/0x410
> > [23160.706931] [] ?
> > ext4_get_block_write+0x20/0x20
> > [23160.706935] [] ext4_ext_direct_IO+0x1bb/0x2a0
> > [23160.706938] [] ? __ext4_journal_stop+0x78/0xa0
> > [23160.706942] [] ext4_direct_IO+0xec/0x1e0
> > [23160.706946] [] ? __mark_inode_dirty+0x53/0x2d0
> > [23160.706952] []
> > generic_file_direct_write+0xbb/0x180
> > [23160.706957] [] ? mnt_clone_write+0x12/0x30
> > [23160.706960] []
> > __generic_file_write_iter+0x2a7/0x350
> > [23160.706963] []
> > ext4_file_write_iter+0x111/0x3d0
> > [23160.706969] [] ? iov_iter_init+0x14/0x40
> > [23160.706976] [] new_sync_write+0x7b/0xb0
> > [23160.706978] [] vfs_write+0xc7/0x1f0
> > [23160.706980] [] SyS_write+0x4f/0xb0
> > [23160.706985] [] system_call_fastpath+0x1a/0x1f
> > [23280.705400] INFO: task dd:2033 blocked for more than 120
> > seconds.
> > [23280.745358] Tainted: G OE 3.16.3-031603-generic
> > #201409171435
> > [23280.785069] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables
> > this message.
> > [23280.864158] dd D 000b 0 2033 23859
> > 0x
> > [23280.864164] 88011cec78c8 0082 88011cec78d8
> > 88011cec7fd8
> > [23280.864167] 000143c0 000143c0 88048661bcc0
> > 880113441440
> > [23280.864170] 88011cec7898 88067fd54cc0 880113441440
> > 880113441440
> > [23280.864173] Call Trace:
> > [23280.864185] [] schedule+0x29/0x70
> > [23280.864197] [] io_schedule+0x8f/0xd0
> > [23280.864203] [] dio_await_completion+0x54/0xd0
> > [23280.864207] []
> > do_blockdev_direct_IO+0x958/0xcc0
> > [23280.864213] [] ? wake_up_bit+0x2e/0x40
> > [23280.864218] [] ?
> > jbd2_journal_dirty_metadata+0xc5/0x260
> > [23280.864224] [] ?
> > ext4_get_block_write+0x20/0x20
> > [23280.864229] [] __blockdev_direct_IO+0x4c/0x50
> > [23280.864239] [] ?
> > ext4_get_block_write+0x20/0x20
> > [23280.864244] [] ext4_ind_direct_IO+0xce/0x410
> > [23280.864247] [] ?
> > ext4_get_block_write+0x20/0x20
> > [23280.864251] [] ext4_ext_direct_IO+0x1bb/0x2a0
> > [23280.864254] [] ? __ext4_journal_stop+0x78/0xa0
> > [23280.864258] [] ext4_direct_IO+0xec/0x1e0
> > [23280.864263] [] ? __mark_inode_dirty+0x53/0x2d0
> > [23280.864268] []
> > generic_file_direct_write+0xbb/0x180
> > [23280.864273] [] ? mnt_clone_write+0x12/0x30
> > [23280.864284] []
> > _

[ceph-users] v0.67.11 dumpling released

2014-09-25 Thread Sage Weil
v0.67.11 "Dumpling"
===

This stable update for Dumpling fixes several important bugs that affect a 
small set of users.

We recommend that all Dumpling users upgrade at their convenience.  If 
none of these issues are affecting your deployment there is no urgency.


Notable Changes
---

* common: fix sending dup cluster log items (#9080 Sage Weil)
* doc: several doc updates (Alfredo Deza)
* libcephfs-java: fix build against older JNI headers (Greg Farnum)
* librados: fix crash in op timeout path (#9362 Matthias Kiefer, Sage Weil)
* librbd: fix crash using clone of flattened image (#8845 Josh Durgin)
* librbd: fix error path cleanup when failing to open image (#8912 Josh Durgin)
* mon: fix crash when adjusting pg_num before any OSDs are added (#9052 
  Sage Weil)
* mon: reduce log noise from paxos (Aanchal Agrawal, Sage Weil)
* osd: allow scrub and snap trim thread pool IO priority to be adjusted 
  (Sage Weil)
* osd: fix mount/remount sync race (#9144 Sage Weil)

Getting Ceph


* Git at git://github.com/ceph/ceph.git
* Tarball at http://ceph.com/download/ceph-0.67.11.tar.gz
* For packages, see http://ceph.com/docs/master/install/get-packages
* For ceph-deploy, see http://ceph.com/docs/master/install/install-ceph-deploy
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Frequent Crashes on rbd to nfs gateway Server

2014-09-25 Thread Ilya Dryomov
On Thu, Sep 25, 2014 at 7:06 PM, Andrei Mikhailovsky  wrote:
> Ilya,
>
> I've not used rbd map on older kernels. Just experimenting with rbd map to
> have an iscsi and nfs gateway service for hypervisors such as xenserver and
> vmware. I've tried it with the latest ubuntu LTS kernel 3.13 I believe and
> noticed the issue.
> Can you not reproduce the hang tasks when doing dd testing? have you tried
> 4K block sizes and running it for sometime, like I have done?

I forget which block size I tried, but it was one that you reported on
the tracker, I didn't make up my own.  I'll try it exactly the way you
described in your previous mail.

Thanks,

Ilya
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Icehouse & Ceph -- live migration fails?

2014-09-25 Thread Daniel Schneller
Hi!

We have an Icehouse system running with librbd based Cinder and Glance
configurations, storing images and volumes in Ceph.

Configuration is (apart from network setup details, of course) by the
book / OpenStack setup guide.

Works very nicely, including regular migration, but live migration of
virtual machines fails. I created a simple machine booting from a volume
based off the Ubuntu 14.04.1 cloud image for testing. 

Using Horizon, I can move this VM from host to host, but when I try to
Live Migrate it from one baremetal host to another, I get an error 
message “Failed to live migrate instance to host ’node02’".

The only related log entry I recognize is in the controller’s nova-api.log:


2014-09-25 17:15:47.679 3616 INFO nova.api.openstack.wsgi 
[req-f3dc3c2e-d366-40c5-a1f1-31db71afd87a f833f8e2d1104e66b9abe9923751dcf2 
a908a95a87cc42cd87ff97da4733c414] HTTP exception thrown: Compute service of 
node02.baremetal.clusterb.centerdevice.local is unavailable at this time.
2014-09-25 17:15:47.680 3616 INFO nova.osapi_compute.wsgi.server 
[req-f3dc3c2e-d366-40c5-a1f1-31db71afd87a f833f8e2d1104e66b9abe9923751dcf2 
a908a95a87cc42cd87ff97da4733c414] 10.102.6.8 "POST 
/v2/a908a95a87cc42cd87ff97da4733c414/servers/0f762f35-64ee-461f-baa4-30f5de4d5ddf/action
 HTTP/1.1" status: 400 len: 333 time: 0.1479030

I cannot see anything of value on the destination host itself.

New machines get scheduled there, so the compute service cannot really
be down.
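
(A quick cross-check of what nova itself thinks about node02, as a sketch:)

  # per-host service state as seen by nova; look for nova-compute on node02
  nova service-list
  # hypervisors nova currently knows about
  nova hypervisor-list

If nova-compute on node02 shows up and enabled there, the "Compute service ... is
unavailable" message from the live-migration pre-checks may be misleading.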

In this thread, Travis
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-March/019944.html
describes a similar situation; however, that was on Folsom, so I wonder whether it
is still applicable.

Would be great to get some outside opinion :)

Thanks!
Daniel

-- 
Daniel Schneller
Mobile Development Lead
 
CenterDevice GmbH  | Merscheider Straße 1
   | 42699 Solingen
tel: +49 1754155711| Deutschland
daniel.schnel...@centerdevice.com  | www.centerdevice.com




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] v0.67.11 dumpling released

2014-09-25 Thread Mike Dawson

On 9/25/2014 11:09 AM, Sage Weil wrote:

v0.67.11 "Dumpling"
===

This stable update for Dumpling fixes several important bugs that affect a
small set of users.

We recommend that all Dumpling users upgrade at their convenience.  If
none of these issues are affecting your deployment there is no urgency.


Notable Changes
---

* common: fix sending dup cluster log items (#9080 Sage Weil)
* doc: several doc updates (Alfredo Deza)
* libcephfs-java: fix build against older JNI headers (Greg Farnum)
* librados: fix crash in op timeout path (#9362 Matthias Kiefer, Sage Weil)
* librbd: fix crash using clone of flattened image (#8845 Josh Durgin)
* librbd: fix error path cleanup when failing to open image (#8912 Josh Durgin)
* mon: fix crash when adjusting pg_num before any OSDs are added (#9052
   Sage Weil)
* mon: reduce log noise from paxos (Aanchal Agrawal, Sage Weil)
* osd: allow scrub and snap trim thread pool IO priority to be adjusted
   (Sage Weil)


Sage,

Thanks for the great work! Could you provide any links describing how to 
tune the scrub and snap trim thread pool IO priority? I couldn't find 
these settings in the docs.


IIUC, 0.67.11 does not include the proposed changes to address #9487 or 
#9503, right?


Thanks,
Mike Dawson



* osd: fix mount/remount sync race (#9144 Sage Weil)

Getting Ceph


* Git at git://github.com/ceph/ceph.git
* Tarball at http://ceph.com/download/ceph-0.67.11.tar.gz
* For packages, see http://ceph.com/docs/master/install/get-packages
* For ceph-deploy, see http://ceph.com/docs/master/install/install-ceph-deploy
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] v0.67.11 dumpling released

2014-09-25 Thread Sage Weil
On Thu, 25 Sep 2014, Mike Dawson wrote:
> On 9/25/2014 11:09 AM, Sage Weil wrote:
> > v0.67.11 "Dumpling"
> > ===
> > 
> > This stable update for Dumpling fixes several important bugs that affect a
> > small set of users.
> > 
> > We recommend that all Dumpling users upgrade at their convenience.  If
> > none of these issues are affecting your deployment there is no urgency.
> > 
> > 
> > Notable Changes
> > ---
> > 
> > * common: fix sending dup cluster log items (#9080 Sage Weil)
> > * doc: several doc updates (Alfredo Deza)
> > * libcephfs-java: fix build against older JNI headers (Greg Farnum)
> > * librados: fix crash in op timeout path (#9362 Matthias Kiefer, Sage Weil)
> > * librbd: fix crash using clone of flattened image (#8845 Josh Durgin)
> > * librbd: fix error path cleanup when failing to open image (#8912 Josh
> > Durgin)
> > * mon: fix crash when adjusting pg_num before any OSDs are added (#9052
> >Sage Weil)
> > * mon: reduce log noise from paxos (Aanchal Agrawal, Sage Weil)
> > * osd: allow scrub and snap trim thread pool IO priority to be adjusted
> >(Sage Weil)
> 
> Sage,
> 
> Thanks for the great work! Could you provide any links describing how to tune
> the scrub and snap trim thread pool IO priority? I couldn't find these
> settings in the docs.

It's 

 osd disk thread ioprio class = idle
 osd disk thread ioprio priority = 0

Note that this is a short-term solution; we eventually want to send all IO 
through the same queue so that we can prioritize things more carefully.  
This setting will most likely go away in the future.
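
To change this on a running cluster without restarting the OSDs, something like
the following should work (just a sketch; same values, using the underscore form
of the option names):

  ceph tell osd.* injectargs '--osd_disk_thread_ioprio_class idle --osd_disk_thread_ioprio_priority 0'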

> IIUC, 0.67.11 does not include the proposed changes to address #9487 or 
> #9503, right?

Correct.  That will come later once it's gone through more testing.

Thanks!
sage
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] v0.67.11 dumpling released

2014-09-25 Thread Dan Van Der Ster
Hi Mike,

> On 25 Sep 2014, at 17:47, Mike Dawson  wrote:
> 
> On 9/25/2014 11:09 AM, Sage Weil wrote:
>> v0.67.11 "Dumpling"
>> ===
>> 
>> This stable update for Dumpling fixes several important bugs that affect a
>> small set of users.
>> 
>> We recommend that all Dumpling users upgrade at their convenience.  If
>> none of these issues are affecting your deployment there is no urgency.
>> 
>> 
>> Notable Changes
>> ---
>> 
>> * common: fix sending dup cluster log items (#9080 Sage Weil)
>> * doc: several doc updates (Alfredo Deza)
>> * libcephfs-java: fix build against older JNI headers (Greg Farnum)
>> * librados: fix crash in op timeout path (#9362 Matthias Kiefer, Sage Weil)
>> * librbd: fix crash using clone of flattened image (#8845 Josh Durgin)
>> * librbd: fix error path cleanup when failing to open image (#8912 Josh 
>> Durgin)
>> * mon: fix crash when adjusting pg_num before any OSDs are added (#9052
>>   Sage Weil)
>> * mon: reduce log noise from paxos (Aanchal Agrawal, Sage Weil)
>> * osd: allow scrub and snap trim thread pool IO priority to be adjusted
>>   (Sage Weil)
> 
> Sage,
> 
> Thanks for the great work! Could you provide any links describing how to tune 
> the scrub and snap trim thread pool IO priority? I couldn't find these 
> settings in the docs.

I use:

[osd]
  osd disk thread ioprio class = 3
  osd disk thread ioprio priority = 0

You’ll need to use the cfq io scheduler for those to have an effect.
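
Switching the OSD data disks to cfq can be done on the fly, e.g. (a sketch --
sdb is an example device; make the change persistent via the kernel command
line, a udev rule or rc.local):

  cat /sys/block/sdb/queue/scheduler           # show the current scheduler
  echo cfq > /sys/block/sdb/queue/scheduler    # switch this disk to cfq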

FYI, I can make scrubs generally transparent by also adding:

  osd scrub sleep = .1
  osd scrub chunk max = 5
  osd deep scrub stride = 1048576

Your mileage may vary.

> IIUC, 0.67.11 does not include the proposed changes to address #9487 or 
> #9503, right?

Those didn’t make it.

Cheers, Dan



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Ceph-maintainers] v0.67.11 dumpling released

2014-09-25 Thread Loic Dachary
Hi,

On 25/09/2014 17:53, Sage Weil wrote:
> On Thu, 25 Sep 2014, Mike Dawson wrote:
>> On 9/25/2014 11:09 AM, Sage Weil wrote:
>>> v0.67.11 "Dumpling"
>>> ===
>>>
>>> This stable update for Dumpling fixes several important bugs that affect a
>>> small set of users.
>>>
>>> We recommend that all Dumpling users upgrade at their convenience.  If
>>> none of these issues are affecting your deployment there is no urgency.
>>>
>>>
>>> Notable Changes
>>> ---
>>>
>>> * common: fix sending dup cluster log items (#9080 Sage Weil)
>>> * doc: several doc updates (Alfredo Deza)
>>> * libcephfs-java: fix build against older JNI headers (Greg Farnum)
>>> * librados: fix crash in op timeout path (#9362 Matthias Kiefer, Sage Weil)
>>> * librbd: fix crash using clone of flattened image (#8845 Josh Durgin)
>>> * librbd: fix error path cleanup when failing to open image (#8912 Josh
>>> Durgin)
>>> * mon: fix crash when adjusting pg_num before any OSDs are added (#9052
>>>Sage Weil)
>>> * mon: reduce log noise from paxos (Aanchal Agrawal, Sage Weil)
>>> * osd: allow scrub and snap trim thread pool IO priority to be adjusted
>>>(Sage Weil)
>>
>> Sage,
>>
>> Thanks for the great work! Could you provide any links describing how to tune
>> the scrub and snap trim thread pool IO priority? I couldn't find these
>> settings in the docs.
> 
> It's 
> 
>  osd disk thread ioprio class = idle
>  osd disk thread ioprio priority = 0
> 
> Note that this is a short-term solution; we eventually want to send all IO 
> through the same queue so that we can prioritize things more carefully.  
> This setting will most likely go away in the future.
> 

The documentation for these can be found at

http://ceph.com/docs/giant/rados/configuration/osd-config-ref/#operations

Control-f ioprio

Cheers

>> IIUC, 0.67.11 does not include the proposed changes to address #9487 or 
>> #9503, right?
> 
> Correct.  That will come later once it's gone through more testing.
> 
> Thanks!
> sage
> ___
> Ceph-maintainers mailing list
> ceph-maintain...@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-maintainers-ceph.com
> 

-- 
Loïc Dachary, Artisan Logiciel Libre



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] v0.67.11 dumpling released

2014-09-25 Thread Sage Weil
On Thu, 25 Sep 2014, Dan Van Der Ster wrote:
> Hi Mike,
> 
> > On 25 Sep 2014, at 17:47, Mike Dawson  wrote:
> > 
> > On 9/25/2014 11:09 AM, Sage Weil wrote:
> >> v0.67.11 "Dumpling"
> >> ===
> >> 
> >> This stable update for Dumpling fixes several important bugs that affect a
> >> small set of users.
> >> 
> >> We recommend that all Dumpling users upgrade at their convenience.  If
> >> none of these issues are affecting your deployment there is no urgency.
> >> 
> >> 
> >> Notable Changes
> >> ---
> >> 
> >> * common: fix sending dup cluster log items (#9080 Sage Weil)
> >> * doc: several doc updates (Alfredo Deza)
> >> * libcephfs-java: fix build against older JNI headers (Greg Farnum)
> >> * librados: fix crash in op timeout path (#9362 Matthias Kiefer, Sage Weil)
> >> * librbd: fix crash using clone of flattened image (#8845 Josh Durgin)
> >> * librbd: fix error path cleanup when failing to open image (#8912 Josh 
> >> Durgin)
> >> * mon: fix crash when adjusting pg_num before any OSDs are added (#9052
> >>   Sage Weil)
> >> * mon: reduce log noise from paxos (Aanchal Agrawal, Sage Weil)
> >> * osd: allow scrub and snap trim thread pool IO priority to be adjusted
> >>   (Sage Weil)
> > 
> > Sage,
> > 
> > Thanks for the great work! Could you provide any links describing how to 
> > tune the scrub and snap trim thread pool IO priority? I couldn't find these 
> > settings in the docs.
> 
> I use:
> 
> [osd]
>   osd disk thread ioprio class = 3

Sigh.. it looks like the version that went into master and firefly uses 
the string names for classes while the dumpling patch takes the numeric 
ID.  Oops.  You'll need to take some care to adjust this setting when you 
upgrade.
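
Concretely (a sketch, assuming the numeric IDs follow the usual Linux ioprio
classes: 1 = rt, 2 = be, 3 = idle):

  # dumpling 0.67.11 -- numeric class ID
  [osd]
      osd disk thread ioprio class = 3
      osd disk thread ioprio priority = 0

  # firefly / master -- string class name; switch to this form when upgrading
  [osd]
      osd disk thread ioprio class = idle
      osd disk thread ioprio priority = 0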

sage
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [ceph-calamari] Setting up Ceph calamari :: Made Simple

2014-09-25 Thread Dan Mick
Can you explain this a little more, Johan?  I've never even heard of
ipvsadm or its facilities before today, and it ought not to be required...
On Sep 25, 2014 7:04 AM, "Johan Kooijman"  wrote:

> Karan,
>
> Thanks for the tutorial, great stuff. Please note that in order to get the
> graphs working, I had to install ipvsadm and create a symlink from
> /sbin/ipvsadm to /usr/bin/ipvsadm (CentOS 6).
>
> On Wed, Sep 24, 2014 at 10:16 AM, Karan Singh  wrote:
>
>> Hello Cepher’s
>>
>> Now here comes my new blog on setting up Ceph Calamari.
>>
>> I hope you would like this step-by-step guide
>>
>> http://karan-mj.blogspot.fi/2014/09/ceph-calamari-survival-guide.html
>>
>>
>> - Karan -
>>
>>
>> ___
>> ceph-calamari mailing list
>> ceph-calam...@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-calamari-ceph.com
>>
>>
>
>
> --
> Met vriendelijke groeten / With kind regards,
> Johan Kooijman
>
> ___
> ceph-calamari mailing list
> ceph-calam...@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-calamari-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] pgs stuck in active+clean+replay state

2014-09-25 Thread Gregory Farnum
I imagine you aren't actually using the data/metadata pool that these
PGs are in, but it's a previously-reported bug we haven't identified:
http://tracker.ceph.com/issues/8758
They should go away if you restart the OSDs that host them (or just
remove those pools), but it's not going to hurt anything as long as
you aren't using them.
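
Concretely, that might look like this (just a sketch; the osd id is an example
and the restart syntax depends on your distro and init system):

  ceph pg map 0.fd                    # shows the up/acting OSDs for a stuck PG
  # then, on the node carrying each of those OSDs:
  restart ceph-osd id=0               # Ubuntu/upstart
  /etc/init.d/ceph restart osd.0      # sysvinit
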
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Thu, Sep 25, 2014 at 3:37 AM, Pavel V. Kaygorodov  wrote:
> Hi!
>
> 16 pgs in our ceph cluster are in active+clean+replay state more than one day.
> All clients are working fine.
> Is this ok?
>
> root@bastet-mon1:/# ceph -w
> cluster fffeafa2-a664-48a7-979a-517e3ffa0da1
>  health HEALTH_OK
>  monmap e3: 3 mons at 
> {1=10.92.8.80:6789/0,2=10.92.8.81:6789/0,3=10.92.8.82:6789/0}, election epoch 
> 2570, quorum 0,1,2 1,2,3
>  osdmap e3108: 16 osds: 16 up, 16 in
>   pgmap v1419232: 8704 pgs, 6 pools, 513 GB data, 125 kobjects
> 2066 GB used, 10879 GB / 12945 GB avail
> 8688 active+clean
>   16 active+clean+replay
>   client io 3237 kB/s wr, 68 op/s
>
>
> root@bastet-mon1:/# ceph pg dump | grep replay
> dumped all in format plain
> 0.fd0   0   0   0   0   0   0   
> active+clean+replay 2014-09-24 02:38:29.902766  0'0 3108:2628 
>   [0,7,14,8] [0,7,14,8]   0   0'0 2014-09-23 02:23:49.463704  
> 0'0 2014-09-23 02:23:49.463704
> 0.e80   0   0   0   0   0   0   
> active+clean+replay 2014-09-24 02:38:21.945082  0'0 3108:1823 
>   [2,7,9,10] [2,7,9,10]   2   0'0 2014-09-22 14:37:32.910787  
> 0'0 2014-09-22 14:37:32.910787
> 0.aa0   0   0   0   0   0   0   
> active+clean+replay 2014-09-24 02:38:29.326607  0'0 3108:2451 
>   [0,7,15,12][0,7,15,12]  0   0'0 2014-09-23 00:39:10.717363  
> 0'0 2014-09-23 00:39:10.717363
> 0.9c0   0   0   0   0   0   0   
> active+clean+replay 2014-09-24 02:38:29.325229  0'0 3108:1917 
>   [0,7,9,12] [0,7,9,12]   0   0'0 2014-09-22 14:40:06.694479  
> 0'0 2014-09-22 14:40:06.694479
> 0.9a0   0   0   0   0   0   0   
> active+clean+replay 2014-09-24 02:38:29.325074  0'0 3108:2486 
>   [0,7,14,11][0,7,14,11]  0   0'0 2014-09-23 01:14:55.825900  
> 0'0 2014-09-23 01:14:55.825900
> 0.910   0   0   0   0   0   0   
> active+clean+replay 2014-09-24 02:38:28.839148  0'0 3108:1962 
>   [0,7,9,10] [0,7,9,10]   0   0'0 2014-09-22 14:37:44.652796  
> 0'0 2014-09-22 14:37:44.652796
> 0.8c0   0   0   0   0   0   0   
> active+clean+replay 2014-09-24 02:38:28.838683  0'0 3108:2635 
>   [0,2,9,11] [0,2,9,11]   0   0'0 2014-09-23 01:52:52.390529  
> 0'0 2014-09-23 01:52:52.390529
> 0.8b0   0   0   0   0   0   0   
> active+clean+replay 2014-09-24 02:38:21.215964  0'0 3108:1636 
>   [2,0,8,14] [2,0,8,14]   2   0'0 2014-09-23 01:31:38.134466  
> 0'0 2014-09-23 01:31:38.134466
> 0.500   0   0   0   0   0   0   
> active+clean+replay 2014-09-24 02:38:35.869160  0'0 3108:1801 
>   [7,2,15,10][7,2,15,10]  7   0'0 2014-09-20 08:38:53.963779  
> 0'0 2014-09-13 10:27:26.977929
> 0.440   0   0   0   0   0   0   
> active+clean+replay 2014-09-24 02:38:35.871409  0'0 3108:1819 
>   [7,2,15,10][7,2,15,10]  7   0'0 2014-09-20 11:59:05.208164  
> 0'0 2014-09-20 11:59:05.208164
> 0.390   0   0   0   0   0   0   
> active+clean+replay 2014-09-24 02:38:28.653190  0'0 3108:1827 
>   [0,2,9,10] [0,2,9,10]   0   0'0 2014-09-22 14:40:50.697850  
> 0'0 2014-09-22 14:40:50.697850
> 0.320   0   0   0   0   0   0   
> active+clean+replay 2014-09-24 02:38:10.970515  0'0 3108:1719 
>   [2,0,14,9] [2,0,14,9]   2   0'0 2014-09-20 12:06:23.716480  
> 0'0 2014-09-20 12:06:23.716480
> 0.2c0   0   0   0   0   0   0   
> active+clean+replay 2014-09-24 02:38:28.647268  0'0 3108:2540 
>   [0,7,12,8] [0,7,12,8]   0   0'0 2014-09-22 23:44:53.387815  
> 0'0 2014-09-22 23:44:53.387815
> 0.1f0   0   0   0   0   0   0   
> active+clean+replay 2014-09-24 02:38:28.651059  0'0 3108:2522 
>   [0,2,14,11][0,2,14,11]  0   0'0 2014-09-22 23:38:16.315755  
> 0'0 2014-09-22 23:38:16.315755
> 0.7 0 

Re: [ceph-users] v0.67.11 dumpling released

2014-09-25 Thread Mike Dawson
Looks like the packages have partially hit the repo, but at least the 
following are missing:


Failed to fetch 
http://ceph.com/debian-dumpling/pool/main/c/ceph/librbd1_0.67.11-1precise_amd64.deb 
 404  Not Found
Failed to fetch 
http://ceph.com/debian-dumpling/pool/main/c/ceph/librados2_0.67.11-1precise_amd64.deb 
 404  Not Found
Failed to fetch 
http://ceph.com/debian-dumpling/pool/main/c/ceph/python-ceph_0.67.11-1precise_amd64.deb 
 404  Not Found
Failed to fetch 
http://ceph.com/debian-dumpling/pool/main/c/ceph/ceph_0.67.11-1precise_amd64.deb 
 404  Not Found
Failed to fetch 
http://ceph.com/debian-dumpling/pool/main/c/ceph/libcephfs1_0.67.11-1precise_amd64.deb 
 404  Not Found


Based on the timestamps of the files that made it, it looks like the 
process to publish the packages isn't still in progress, but rather 
failed yesterday.


Thanks,
Mike Dawson


On 9/25/2014 11:09 AM, Sage Weil wrote:

v0.67.11 "Dumpling"
===

This stable update for Dumpling fixes several important bugs that affect a
small set of users.

We recommend that all Dumpling users upgrade at their convenience.  If
none of these issues are affecting your deployment there is no urgency.


Notable Changes
---

* common: fix sending dup cluster log items (#9080 Sage Weil)
* doc: several doc updates (Alfredo Deza)
* libcephfs-java: fix build against older JNI headers (Greg Farnum)
* librados: fix crash in op timeout path (#9362 Matthias Kiefer, Sage Weil)
* librbd: fix crash using clone of flattened image (#8845 Josh Durgin)
* librbd: fix error path cleanup when failing to open image (#8912 Josh Durgin)
* mon: fix crash when adjusting pg_num before any OSDs are added (#9052
   Sage Weil)
* mon: reduce log noise from paxos (Aanchal Agrawal, Sage Weil)
* osd: allow scrub and snap trim thread pool IO priority to be adjusted
   (Sage Weil)
* osd: fix mount/remount sync race (#9144 Sage Weil)

Getting Ceph


* Git at git://github.com/ceph/ceph.git
* Tarball at http://ceph.com/download/ceph-0.67.11.tar.gz
* For packages, see http://ceph.com/docs/master/install/get-packages
* For ceph-deploy, see http://ceph.com/docs/master/install/install-ceph-deploy
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] v0.67.11 dumpling released

2014-09-25 Thread Alfredo Deza
On Thu, Sep 25, 2014 at 1:27 PM, Mike Dawson  wrote:
> Looks like the packages have partially hit the repo, but at least the
> following are missing:
>
> Failed to fetch
> http://ceph.com/debian-dumpling/pool/main/c/ceph/librbd1_0.67.11-1precise_amd64.deb
> 404  Not Found
> Failed to fetch
> http://ceph.com/debian-dumpling/pool/main/c/ceph/librados2_0.67.11-1precise_amd64.deb
> 404  Not Found
> Failed to fetch
> http://ceph.com/debian-dumpling/pool/main/c/ceph/python-ceph_0.67.11-1precise_amd64.deb
> 404  Not Found
> Failed to fetch
> http://ceph.com/debian-dumpling/pool/main/c/ceph/ceph_0.67.11-1precise_amd64.deb
> 404  Not Found
> Failed to fetch
> http://ceph.com/debian-dumpling/pool/main/c/ceph/libcephfs1_0.67.11-1precise_amd64.deb
> 404  Not Found
>
> Based on the timestamps of the files that made it, it looks like the process
> to publish the packages isn't still in progress, but rather failed yesterday.

That is odd. I just went ahead and re-pushed the packages and they are
now showing up.

Thanks for letting us know!


>
> Thanks,
> Mike Dawson
>
>
> On 9/25/2014 11:09 AM, Sage Weil wrote:
>>
>> v0.67.11 "Dumpling"
>> ===
>>
>> This stable update for Dumpling fixes several important bugs that affect a
>> small set of users.
>>
>> We recommend that all Dumpling users upgrade at their convenience.  If
>> none of these issues are affecting your deployment there is no urgency.
>>
>>
>> Notable Changes
>> ---
>>
>> * common: fix sending dup cluster log items (#9080 Sage Weil)
>> * doc: several doc updates (Alfredo Deza)
>> * libcephfs-java: fix build against older JNI headers (Greg Farnum)
>> * librados: fix crash in op timeout path (#9362 Matthias Kiefer, Sage
>> Weil)
>> * librbd: fix crash using clone of flattened image (#8845 Josh Durgin)
>> * librbd: fix error path cleanup when failing to open image (#8912 Josh
>> Durgin)
>> * mon: fix crash when adjusting pg_num before any OSDs are added (#9052
>>Sage Weil)
>> * mon: reduce log noise from paxos (Aanchal Agrawal, Sage Weil)
>> * osd: allow scrub and snap trim thread pool IO priority to be adjusted
>>(Sage Weil)
>> * osd: fix mount/remount sync race (#9144 Sage Weil)
>>
>> Getting Ceph
>> 
>>
>> * Git at git://github.com/ceph/ceph.git
>> * Tarball at http://ceph.com/download/ceph-0.67.11.tar.gz
>> * For packages, see http://ceph.com/docs/master/install/get-packages
>> * For ceph-deploy, see
>> http://ceph.com/docs/master/install/install-ceph-deploy
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] pgs stuck in active+clean+replay state

2014-09-25 Thread Pavel V. Kaygorodov
Hi!

> I imagine you aren't actually using the data/metadata pool that these
> PGs are in, but it's a previously-reported bug we haven't identified:
> http://tracker.ceph.com/issues/8758
> They should go away if you restart the OSDs that host them (or just
> remove those pools), but it's not going to hurt anything as long as
> you aren't using them.

Thanks a lot, restarting the OSDs helped!
BTW, I tried to delete the data and metadata pools just after setup, but ceph 
refused to let me do this.
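
In case it helps anyone else, my understanding is that removing them needs the
explicit confirmation flag, e.g. (just a sketch):

  ceph osd pool delete data data --yes-i-really-really-mean-it
  ceph osd pool delete metadata metadata --yes-i-really-really-mean-it

and that the monitor may still refuse while something (such as the default MDS
filesystem) references those pools.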

With best regards,
  Pavel.



> On Thu, Sep 25, 2014 at 3:37 AM, Pavel V. Kaygorodov  wrote:
>> Hi!
>> 
>> 16 pgs in our ceph cluster are in active+clean+replay state more than one 
>> day.
>> All clients are working fine.
>> Is this ok?
>> 
>> root@bastet-mon1:/# ceph -w
>>cluster fffeafa2-a664-48a7-979a-517e3ffa0da1
>> health HEALTH_OK
>> monmap e3: 3 mons at 
>> {1=10.92.8.80:6789/0,2=10.92.8.81:6789/0,3=10.92.8.82:6789/0}, election 
>> epoch 2570, quorum 0,1,2 1,2,3
>> osdmap e3108: 16 osds: 16 up, 16 in
>>  pgmap v1419232: 8704 pgs, 6 pools, 513 GB data, 125 kobjects
>>2066 GB used, 10879 GB / 12945 GB avail
>>8688 active+clean
>>  16 active+clean+replay
>>  client io 3237 kB/s wr, 68 op/s
>> 
>> 
>> root@bastet-mon1:/# ceph pg dump | grep replay
>> dumped all in format plain
>> 0.fd0   0   0   0   0   0   0   
>> active+clean+replay 2014-09-24 02:38:29.902766  0'0 3108:2628
>>[0,7,14,8] [0,7,14,8]   0   0'0 2014-09-23 
>> 02:23:49.463704  0'0 2014-09-23 02:23:49.463704
>> 0.e80   0   0   0   0   0   0   
>> active+clean+replay 2014-09-24 02:38:21.945082  0'0 3108:1823
>>[2,7,9,10] [2,7,9,10]   2   0'0 2014-09-22 
>> 14:37:32.910787  0'0 2014-09-22 14:37:32.910787
>> 0.aa0   0   0   0   0   0   0   
>> active+clean+replay 2014-09-24 02:38:29.326607  0'0 3108:2451
>>[0,7,15,12][0,7,15,12]  0   0'0 2014-09-23 
>> 00:39:10.717363  0'0 2014-09-23 00:39:10.717363
>> 0.9c0   0   0   0   0   0   0   
>> active+clean+replay 2014-09-24 02:38:29.325229  0'0 3108:1917
>>[0,7,9,12] [0,7,9,12]   0   0'0 2014-09-22 
>> 14:40:06.694479  0'0 2014-09-22 14:40:06.694479
>> 0.9a0   0   0   0   0   0   0   
>> active+clean+replay 2014-09-24 02:38:29.325074  0'0 3108:2486
>>[0,7,14,11][0,7,14,11]  0   0'0 2014-09-23 
>> 01:14:55.825900  0'0 2014-09-23 01:14:55.825900
>> 0.910   0   0   0   0   0   0   
>> active+clean+replay 2014-09-24 02:38:28.839148  0'0 3108:1962
>>[0,7,9,10] [0,7,9,10]   0   0'0 2014-09-22 
>> 14:37:44.652796  0'0 2014-09-22 14:37:44.652796
>> 0.8c0   0   0   0   0   0   0   
>> active+clean+replay 2014-09-24 02:38:28.838683  0'0 3108:2635
>>[0,2,9,11] [0,2,9,11]   0   0'0 2014-09-23 
>> 01:52:52.390529  0'0 2014-09-23 01:52:52.390529
>> 0.8b0   0   0   0   0   0   0   
>> active+clean+replay 2014-09-24 02:38:21.215964  0'0 3108:1636
>>[2,0,8,14] [2,0,8,14]   2   0'0 2014-09-23 
>> 01:31:38.134466  0'0 2014-09-23 01:31:38.134466
>> 0.500   0   0   0   0   0   0   
>> active+clean+replay 2014-09-24 02:38:35.869160  0'0 3108:1801
>>[7,2,15,10][7,2,15,10]  7   0'0 2014-09-20 
>> 08:38:53.963779  0'0 2014-09-13 10:27:26.977929
>> 0.440   0   0   0   0   0   0   
>> active+clean+replay 2014-09-24 02:38:35.871409  0'0 3108:1819
>>[7,2,15,10][7,2,15,10]  7   0'0 2014-09-20 
>> 11:59:05.208164  0'0 2014-09-20 11:59:05.208164
>> 0.390   0   0   0   0   0   0   
>> active+clean+replay 2014-09-24 02:38:28.653190  0'0 3108:1827
>>[0,2,9,10] [0,2,9,10]   0   0'0 2014-09-22 
>> 14:40:50.697850  0'0 2014-09-22 14:40:50.697850
>> 0.320   0   0   0   0   0   0   
>> active+clean+replay 2014-09-24 02:38:10.970515  0'0 3108:1719
>>[2,0,14,9] [2,0,14,9]   2   0'0 2014-09-20 
>> 12:06:23.716480  0'0 2014-09-20 12:06:23.716480
>> 0.2c0   0   0   0   0   0   0   
>> active+clean+replay 2014-09-24 02:38:28.647268  0'0 3108:2540
>>[0,7,12,8] [0,7,12,8]   0   0'0 2014-09-22 
>> 23:44:53.387815  0'0 2014-09-22 23:44:53.387815
>> 0.1f0   0   0   0   0   0   0   
>> active+clean+repl

Re: [ceph-users] Any way to remove possible orphaned files in a federated gateway configuration

2014-09-25 Thread Lyn Mitchell
Thanks Yehuda for your response, much appreciated.

Using the "radosgw-admin object stat" option I was able to reconcile the 
objects on master and slave.  There are 10 objects on the master that have 
replicated to the slave; for these 10 objects I was able to confirm the match by 
pulling the tag prefix from "object stat" and verifying size, name, etc.  There 
are still a large number of "shadow" files in the .region-1.zone-2.rgw.buckets 
pool which have no corresponding object to cross-reference using the "object 
stat" command.  These files are taking up several hundred GB on the OSDs in the 
region-2 cluster.  What would be the correct way to remove these "shadow" files 
that no longer have objects associated with them?  Is there a process that will 
clean up these orphaned objects?  Any steps anyone can provide to remove these 
files would be greatly appreciated.
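
For reference, the reconciliation I've been doing looks roughly like this (just 
a sketch: <bucket> is a placeholder for our bucket name, and the exact field 
names in the "object stat" JSON may differ by version):

  # 1. every shadow object currently in the slave zone's data pool
  rados -p .region-1.zone-2.rgw.buckets ls | grep __shadow_ | sort > shadow_objects.txt

  # 2. stat every object the gateway still knows about and keep the tag prefixes
  radosgw-admin bucket list --bucket=<bucket> | grep '"name"' | cut -d'"' -f4 |
  while read obj; do
      radosgw-admin object stat --bucket=<bucket> --object="$obj"
  done | grep -o '"tag": *"[^"]*"' | cut -d'"' -f4 | sort -u > referenced_tags.txt

  # 3. shadow objects whose tag never appears in step 2 are candidate orphans
  grep -v -F -f referenced_tags.txt shadow_objects.txt > candidate_orphans.txt

I assume the candidates could then be removed with "rados -p 
.region-1.zone-2.rgw.buckets rm <object>", but since that is destructive I'd 
rather hear whether there is a supported cleanup process first.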

BTW - Since my original post several objects have been copied via s3 client to 
the master and everything appears to be replicating without issue.  Objects 
have been deleted as well, the sync looks fine, objects are being removed from 
master and slave.  I'm pretty sure the large number of orphaned "shadow" files 
that are currently in the .region-1.zone-2.rgw.buckets pool are from the 
original sync performed back on Sept. 15.

Thanks in advance,
MLM

-Original Message-
From: yehud...@gmail.com [mailto:yehud...@gmail.com] On Behalf Of Yehuda Sadeh
Sent: Tuesday, September 23, 2014 5:30 PM
To: lyn_mitch...@bellsouth.net
Cc: ceph-users; ceph-commun...@lists.ceph.com
Subject: Re: [ceph-users] Any way to remove possible orphaned files in a 
federated gateway configuration

On Tue, Sep 23, 2014 at 3:05 PM, Lyn Mitchell  wrote:
> Is anyone aware of a way to either reconcile or remove possible 
> orphaned "shadow" files in a federated gateway configuration?  The 
> issue we're seeing is that the slave has many more chunk/"shadow" files 
> than the master; the breakdown is as follows:
> files than the master, the breakdown is as follows:
>
> master zone:
>
> .region-1.zone-1.rgw.buckets = 1737 “shadow” files of which there are 
> 10 distinct sets of tags, an example of 1 distinct set is:
>
> alph-1.80907.1__shadow_.VTZYW5ubV53wCHAKcnGwrD_yGkyGDuG_1 through
> alph-1.80907.1__shadow_.VTZYW5ubV53wCHAKcnGwrD_yGkyGDuG_516
>
>
>
> slave zone:
>
> .region-1.zone-2.rgw.buckets = 331961 “shadow” files, of which there 
> are 652 distinct sets of  tags, examples:
>
> 1 set having 516 “shadow” files:
>
> alph-1.80907.1__shadow_.yPT037fjWhTi_UtHWSYPcRWBanaN9Oy_1 through
> alph-1.80907.1__shadow_.yPT037fjWhTi_UtHWSYPcRWBanaN9Oy_516
>
>
>
> 236 sets having 515 “shadow” files apiece:
>
> alph-1.80907.1__shadow_.RA9KCc_U5T9kBN_ggCUx8VLJk36RSiw_1 through
> alph-1.80907.1__shadow_.RA9KCc_U5T9kBN_ggCUx8VLJk36RSiw_515
>
> alph-1.80907.1__shadow_.aUWuanLbJD5vbBSD90NWwjkuCxQmvbQ_1 through
> alph-1.80907.1__shadow_.aUWuanLbJD5vbBSD90NWwjkuCxQmvbQ_515

These are all part of the same bucket (prefixed by alph-1.80907.1).

>
> ….
>
>
>
> The number of shadow files in zone-2 is taking quite a bit of space from the
> OSD’s in the cluster.   Without being able to trace back to the original
> file name from an s3 or rados tag, I have no way of knowing which 
> files these are.  Is it possible that the same file may have been 
> replicated multiple times, due to network or connectivity issues?
>
>
>
> I can provide any logs or other information that may provide some 
> help, however at this point we’re not seeing any real errors.
>
>
>
> Thanks in advance for any help that can be provided,

You can also run the following command on the existing objects within that 
specific bucket:

$ radosgw-admin object stat --bucket= --object=

This will show the mapping from the rgw object to the rados objects that 
construct it.


Yehuda

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD import slow

2014-09-25 Thread Josh Durgin

On 09/24/2014 04:57 PM, Brian Rak wrote:

I've been doing some testing of importing virtual machine images, and
I've found that 'rbd import' is at least 2x as slow as 'qemu-img
convert'.  Is there anything I can do to speed this process up?  I'd
like to use rbd import because it gives me a little additional flexibility.

My test setup was a 40960MB LVM volume, and I used the following two
commands:

rbd import /dev/lvmtest/testvol test
qemu-img convert /dev/lvmtest/testvol rbd:test/test

rbd import took 13 minutes, qemu-img took 5.

I'm at a loss to explain this; I would have expected rbd import to be
faster.

This is with ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6)


rbd import was doing one synchronous I/O after another. Recently import
and export were parallelized according to 
--rbd-concurrent-management-ops (default 10), which helps quite a bit. 
This will be in giant.
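
On a giant (or later) client that would look something like this (a sketch; the
value can also be set in ceph.conf as rbd concurrent management ops):

  rbd import --rbd-concurrent-management-ops 20 /dev/lvmtest/testvol test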

Josh
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Best practice about using multiple disks on one single OSD

2014-09-25 Thread James Pan
Hi,

I have several servers and each server has 4 disks.
Now I am going to set up Ceph on these servers and use all 4 disks, but it 
seems one OSD instance can only be configured with one backend storage device. 

So there seem to be two options to me:

1. Make the 4 disks into a RAID 0 and set up one OSD to use it, but obviously 
this is not good because one disk failure will ruin the entire storage.
2. Build a filesystem on each disk and start 4 OSD instances on the server.

Neither option seems good. So I am wondering what the best practice is for 
setting up multiple disks on one OSD node for Ceph.


Thanks!
Best Regards,



James Jiaming Pan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Best practice about using multiple disks on one single OSD

2014-09-25 Thread Jean-Charles LOPEZ
Hi James,

the best practice is to set up 1 OSD daemon per physical disk drive.

In your case, each OSD node would hence be 4 OSD daemons using one physical 
drive per daemon, and deploying a minimum of 3 servers so each object copy 
resides on a separate physical server.
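
With ceph-deploy that would look something like this (a sketch; host and device
names are examples):

  # one OSD daemon per physical disk, on each of the three hosts
  ceph-deploy osd create node1:sdb node1:sdc node1:sdd node1:sde
  ceph-deploy osd create node2:sdb node2:sdc node2:sdd node2:sde
  ceph-deploy osd create node3:sdb node3:sdc node3:sdd node3:sde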

JC



On Sep 25, 2014, at 20:42, James Pan  wrote:

> Hi,
> 
> I have several servers and each server has 4 disks.
> Now I am going to setup Ceph on these servers and use all the 4 disks but it 
> seems one OSD instance can be configured with one backend storage. 
> 
> So there seems two options to me:
> 
> 1. Make the 4 disks into a raid0 then setup OSD to use this raid0 but 
> obviously this is not good because one disk failure will ruin the entire 
> storage.
> 2. Build FS on each disk and start 4 OSD instances on the server.
> 
> Both options are not good. So I am wondering what's the best practice of 
> setting up multiple disks on one OSD for Ceph.
> 
> 
> Thanks!
> Best Regards,
> 
> 
> 
> James Jiaming Pan
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] iptables

2014-09-25 Thread shiva rkreddy
Hello,
On my ceph cluster OSD node there is a rule to REJECT all traffic.
As per the documentation, I added a rule to allow the traffic on the full
range of ports, but the cluster will not come into a clean state. Can you
please share your experience with the iptables configuration?

Following are the INPUT rules:

5    ACCEPT  tcp  --  10.108.240.192/26  0.0.0.0/0  multiport dports 6800:7100
6    REJECT  all  --  0.0.0.0/0          0.0.0.0/0  reject-with icmp-host-prohibited
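
Should the rules instead look something like this (just a sketch based on my
reading of the docs -- newer docs use 6800:7300 for OSDs, and port 6789 is only
needed if a monitor runs on the same host)?

  # insert the ACCEPT rules ahead of the final REJECT rule
  iptables -I INPUT 5 -p tcp -s 10.108.240.192/26 --dport 6789 -j ACCEPT
  iptables -I INPUT 6 -p tcp -s 10.108.240.192/26 -m multiport --dports 6800:7300 -j ACCEPT
  service iptables save    # CentOS/RHEL: persist across reboots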

Thanks,
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Best practice about using multiple disks on one single OSD

2014-09-25 Thread James Pan
Thank you JC.
 
Best Regards,




James Jiaming Pan



On Friday, September 26, 2014 12:25 PM, Jean-Charles LOPEZ 
 wrote:
Hi James,

the best practice is to set up 1 OSD daemon per physical disk drive.

In your case, each OSD node would hence be 4 OSD daemons using one physical 
drive per daemon, and deploying a minimum of 3 servers so each object copy 
resides on a separate physical server.

JC




On Sep 25, 2014, at 20:42, James Pan  wrote:

> Hi,
> 
> I have several servers and each server has 4 disks.
> Now I am going to setup Ceph on these servers and use all the 4 disks but it 
> seems one OSD instance can be configured with one backend storage. 
> 
> So there seems two options to me:
> 
> 1. Make the 4 disks into a raid0 then setup OSD to use this raid0 but 
> obviously this is not good because one disk failure will ruin the entire 
> storage.
> 2. Build FS on each disk and start 4 OSD instances on the server.
> 
> Both options are not good. So I am wondering what's the best practice of 
> setting up multiple disks on one OSD for Ceph.
> 
> 
> Thanks!
> Best Regards,
> 
> 
> 
> James Jiaming Pan
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com