[ceph-users] waiting for 1 open ops to drain

2013-03-20 Thread Dave (Bob)
I am using ceph 0.58 and kernel 3.9-rc2 and btrfs on my osds.

I have an osd that starts up but blocks with the log message 'waiting
for 1 open ops to drain'.

The draining never completes, so I can't get the osd 'up'.

I need to clear this problem. I have recently had an osd go problematic
and I have recreated a fresh btrfs filesystem on the problem osd drive.
I have also added a completely new osd.

The 'waiting for 1 open ops to drain' problem has occurred before the
cluster has recovered from the earlier surgery and I need to get the
data from this osd.

I have increased the number of copies from 2 to 3 to give me more
resilience in the future, but that has not taken effect yet.
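
For reference, a minimal sketch of the commands involved in that change, assuming the default pool named 'data' (adjust the pool name to your setup):

  ceph osd pool set data size 3
  ceph osd dump | grep 'rep size'   # check each pool's current replication factor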

Once I get the cluster back to health, I will mkfs.btrfs and rebuild
this osd and one other that is a legacy from earlier kernel/ceph versions.

How can I tell the osd not to bother with waiting for its open ops to drain?

Thank you in anticipation.

David


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] create volume from an image

2013-03-20 Thread Chen, Xiaoxi
Thanks Josh, the problem is solved by updating Ceph on the Glance node.

Sent from my iPhone
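
For anyone hitting the same symptom, a quick hedged check of whether an uploaded image ended up as format 1 or format 2, assuming the Glance pool is named 'images' (the pool name is illustrative; the image id is the one from this thread):

  rbd -p images info 6565d775-553b-41b6-9d5e-ddb825677706 | grep format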

On 2013-3-20, at 14:59, "Josh Durgin" wrote:

> On 03/19/2013 11:03 PM, Chen, Xiaoxi wrote:
>> I think Josh may be the right man for this question ☺
>> 
>> To be more precise, I would like to add more detail about the status:
>> 
>> 1. We have configured “show_image_direct_url = True” in Glance, and from the 
>> Cinder-volume’s log we can confirm that we have got a direct_url, for example:
>> image_id 6565d775-553b-41b6-9d5e-ddb825677706
>> image_location rbd://6565d775-553b-41b6-9d5e-ddb825677706
>> 2. In the _is_cloneable function, it tries to “_parse_location” the 
>> direct_url (rbd://6565d775-553b-41b6-9d5e-ddb825677706) into 4 parts: 
>> fsid, pool, volume, snapshot. Since the direct_url passed from Glance doesn’t 
>> provide the fsid, pool and snapshot info, the parse fails and _is_cloneable 
>> returns false, which finally drops the request down to 
>> RBDDriver::copy_image_to_volume.
> 
> This is working as expected - cloning was introduced in format 2 rbd
> volumes, available in bobtail but not argonaut. When the image is
> uploaded to glance, it is created as format 2 and a snapshot of it is
> taken and protected from deletion if the installed version of librbd
> (via the python-ceph package) supports it.
> 
> The location reported will be just the image id for format 1 images.
> For format 2 images, it has 4 parts, as you noted. You may need to
> update python-ceph and librbd1 on the node running glance-api
> and re-upload the image so it will be created as format 2, rather
> than the current image which is format 1, and thus cannot be cloned.
> 
>> 3. In Cinder/volume/driver.py, RBDDriver::copy_image_to_volume, we have seen 
>> this note:
>>  # TODO(jdurgin): replace with librbd  this is a temporary hack, since 
>> rewriting this driver to use librbd would take too long
>>  In this function, the Cinder RBD driver downloads the whole 
>> image from Glance into a temp file in the local filesystem, then uses rbd import 
>> to import the temp file into an RBD volume.
> 
> That note is about needing to remove the volume before importing the
> data, instead of just writing to it directly with librbd.
>
>> This is absolutely not what we want (zero copy and CoW), so we are 
>> digging into the _is_cloneable function.
>> 
>> It seems the straightforward way to solve 2) is to write a patch for Glance that 
>> adds more info to the direct_url, but I am not sure whether it is possible for 
>> Ceph to clone an RBD image from pool A to pool B?
> 
> Cloning from one pool to another is certainly supported. If you're
> interested in more details about cloning, check out the command line
> usage [1] and internal design [2].
> 
> Josh
> 
> [1] https://github.com/ceph/ceph/blob/master/doc/rbd/rbd-snapshot.rst#layering
> [2] https://github.com/ceph/ceph/blob/master/doc/dev/rbd-layering.rst
> 
>> From: ceph-users-boun...@lists.ceph.com 
>> [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Li, Chen
>> Sent: March 20, 2013, 12:57
>> To: 'ceph-users@lists.ceph.com'
>> Subject: [ceph-users] create volume from an image
>> 
>> I'm using Ceph RBD for both Cinder and Glance. Cinder and Glance are 
>> installed on two machines.
>> I have read in many places that when Cinder and Glance both use Ceph RBD, 
>> no real data transfer should happen because of copy-on-write.
>> But the truth is, when I run the command:
>> cinder create --image-id 6565d775-553b-41b6-9d5e-ddb825677706 --display-name 
>> test 3
>> I can still see network data traffic between Cinder and Glance.
>> And when I check the Cinder code, the image_location is None 
>> (cinder/volume/manager.py), which makes Cinder fail when running cloned = 
>> self.driver.clone_image(volume_ref, image_location).
>> Is this an OpenStack (Cinder or Glance) bug?
>> Or have I missed some configuration?
>> 
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> 
> 
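
For reference, a minimal sketch of the cross-pool clone workflow Josh describes, assuming a format 2 image named 'img' in a pool 'glance' being cloned into a pool 'cinder' (all names are illustrative):

  rbd snap create glance/img@base
  rbd snap protect glance/img@base
  rbd clone glance/img@base cinder/volume-xyz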
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Replacement hardware

2013-03-20 Thread Igor Laskovy
Hi there!

What steps need to be performed if we have totally lost a node?
As I already understand from the docs, OSDs must be recreated (disabled,
removed and created again, right?)
But what about the MON and MDS?

-- 
Igor Laskovy
facebook.com/igor.laskovy
Kiev, Ukraine
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Calculate and increase pg_num

2013-03-20 Thread Marco Aroldi
Dan, Sébastien,
thanks for the hints.

For Inktank:
There's no doubt that pg_num is a very important parameter to set
appropriately, and given the lack of a stable command to increase it,
the workaround by Sébastien Han along with the advice from Dan van der
Ster should be included in the documentation.
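
(For context, the commonly cited rule of thumb is total PGs ≈ (number of OSDs * 100) / replica count, rounded up to the next power of two; e.g. 12 OSDs with 3 replicas gives 12 * 100 / 3 = 400, so pg_num = 512. A back-of-the-envelope sketch only, not an official formula.)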

--
Marco Aroldi


2013/3/15 Dan van der Ster :
> On Fri, Mar 15, 2013 at 4:44 PM, Marco Aroldi  wrote:
>> Dan,
>> this sound weird:
>> how can you run "cephfs /mnt/mycephfs set_layout 10" on a unmounted 
>> mountpoint?
>
> We had cephfs still mounted from earlier (before the copy pool, delete
> pool). Basically, any file reads resulted in an I/O error, but
> nevertheless the set_layout worked. We didn't try mounting cephfs
> again when the root pool didn't exist.
>
> Cheers, Dan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Replacement hardware

2013-03-20 Thread Dave (Bob)
Igor,

I am sure that I'm right in saying that you just have to create a new
filesystem (btrfs?) on the new block device, mount it, and then
initialise the osd with:

ceph-osd -i <id> --mkfs

Then you can start the osd with:

ceph-osd -i <id>

Since you are replacing an osd that already existed, the cluster knows
about it, and there is a key for it that is known.

I don't claim any great expertise, but this is what I've been doing, and
the cluster seems to adopt the new osd and sort everything out.

David
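
For reference, a rough sketch of the sequence described above, assuming the replacement disk is /dev/sdb, the osd id is 2, and the default data path is used (adjust device, id and paths to your setup; the osd's existing keyring may also need to be put back into its data directory):

  mkfs.btrfs /dev/sdb
  mount /dev/sdb /var/lib/ceph/osd/ceph-2
  ceph-osd -i 2 --mkfs
  ceph-osd -i 2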
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] pgs stuck unclean

2013-03-20 Thread Gergely Pál - night[w]

Hello!

I've deployed a test ceph cluster according to this guide: 
http://ceph.com/docs/master/start/quick-start/


The problem is that the cluster will never go to a clean state by itself.

The corresponding outputs are the following:
root@test-4:~# ceph health
HEALTH_WARN 3 pgs degraded; 38 pgs stuck unclean; recovery 2/44 degraded (4.545%)




root@test-4:~# ceph -s
   health HEALTH_WARN 3 pgs degraded; 38 pgs stuck unclean; recovery 2/44 degraded (4.545%)
   monmap e1: 1 mons at {a=10.0.0.3:6789/0}, election epoch 1, quorum 0 a
   osdmap e45: 2 osds: 2 up, 2 in
   pgmap v344: 384 pgs: 346 active+clean, 35 active+remapped, 3 active+degraded; 6387 KB data, 2025 MB used, 193 GB / 200 GB avail; 2/44 degraded (4.545%)
   mdsmap e29: 1/1/1 up {0=a=up:active}



root@test-4:~# ceph pg dump_stuck unclean
ok
pg_stat	objects	mip	degr	unf	bytes	log	disklog	state	state_stamp	v	reported	up	acting	last_scrub	scrub_stamp	last_deep_scrub	deep_scrub_stamp
1.6b	0	0	0	0	0	0	0	active+remapped	2013-03-20 17:13:22.056155	0'0	36'45	[0]	[0,1]	0'0	2013-03-20 16:50:19.699765	0'0	2013-03-20 16:50:19.699765
2.6a	0	0	0	0	0	0	0	active+remapped	2013-03-20 17:13:22.062933	0'0	36'45	[0]	[0,1]	0'0	2013-03-20 16:53:22.749668	0'0	2013-03-20 16:53:22.749668
0.62	0	0	0	0	0	2584	2584	active+remapped	2013-03-20 17:15:10.953654	17'19	39'63	[1]	[1,0]	11'12	2013-03-20 16:46:48.646752	11'12	2013-03-20 16:46:48.646752
2.60	0	0	0	0	0	0	0	active+remapped	2013-03-20 17:14:25.331682	0'0	39'47	[1]	[1,0]	0'0	2013-03-20 16:53:04.744990	0'0	2013-03-20 16:53:04.744990
1.61	0	0	0	0	0	0	0	active+remapped	2013-03-20 17:14:25.345445	0'0	39'47	[1]	[1,0]	0'0	2013-03-20 16:49:58.694300	0'0	2013-03-20 16:49:58.694300
2.45	0	0	0	0	0	0	0	active+remapped	2013-03-20 17:14:25.179279	0'0	39'75	[1]	[1,0]	0'0	2013-03-20 16:49:43.649700	0'0	2013-03-20 16:49:43.649700
1.46	0	0	0	0	0	0	0	active+remapped	2013-03-20 17:14:25.179239	0'0	39'75	[1]	[1,0]	0'0	2013-03-20 16:47:10.610772	0'0	2013-03-20 16:47:10.610772
0.47	0	0	0	0	0	3808	3808	active+remapped	2013-03-20 17:15:10.953601	17'28	39'93	[1]	[1,0]	11'19	2013-03-20 16:44:31.572090	11'19	2013-03-20 16:44:31.572090
0.3c	0	0	0	0	0	3128	3128	active+remapped	2013-03-20 17:14:08.006824	17'23	36'53	[0]	[0,1]	11'14	2013-03-20 16:46:13.639052	11'14	2013-03-20 16:46:13.639052
1.3b	1	0	0	0	2338546	4224	4224	active+remapped	2013-03-20 17:13:22.018020	41'33	36'87	[0]	[0,1]	0'0	2013-03-20 16:49:01.678543	0'0	2013-03-20 16:49:01.678543
2.3a	0	0	0	0	0	0	0	active+remapped	2013-03-20 17:13:22.022849	0'0	36'45	[0]	[0,1]	0'0	2013-03-20 16:52:06.728006	0'0	2013-03-20 16:52:06.728006
0.35	0	0	0	0	0	4216	4216	active+remapped	2013-03-20 17:14:08.006831	17'31	36'47	[0]	[0,1]	11'23	2013-03-20 16:46:05.636185	11'23	2013-03-20 16:46:05.636185
1.34	0	0	0	0	0	0	0	active+remapped	2013-03-20 17:13:22.036661	0'0	36'45	[0]	[0,1]	0'0	2013-03-20 16:48:46.674504	0'0	2013-03-20 16:48:46.674504
2.33	0	0	0	0	0	0	0	active+remapped	2013-03-20 17:13:22.048476	0'0	36'45	[0]	[0,1]	0'0	2013-03-20 16:51:49.724215	0'0	2013-03-20 16:51:49.724215
0.21	0	0	0	0	0	1360	1360	active+remapped	2013-03-20 17:15:10.953645	17'10	39'20	[1]	[1,0]	0'0	0.00	0'0	0.00
1.20	0	0	0	0	0	0	0	active+remapped	2013-03-20 17:14:25.290933	0'0	39'19	[1]	[1,0]	0'0	0.00	0'0	0.00
2.1f	0	0	0	0	0	0	0	active+remapped	2013-03-20 17:14:25.309581	0'0	39'19	[1]	[1,0]	0'0	0.00	0'0	0.00
0.1d	0	0	0	0	0	4080	4080	active+remapped	2013-03-20 17:14:08.006880	17'30	36'124	[0]	[0,1]	11'20	2013-03-20 16:43:51.560375	11'20	2013-03-20 16:43:51.560375
1.1c	0	0	0	0	0	0	0	active+remapped	2013-03-20 17:13:22.131767	0'0	36'83	[0]	[0,1]	0'0	2013-03-20 16:46:06.593051	0'0	2013-03-20 16:46:06.593051
2.1b	0	0	0	0	0	0	0	active+remapped	2013-03-20 17:13:22.148274	0'0	36'83	[0]	[0,1]	0'0	2013-03-20 16:48:39.633091	0'0	2013-03-20 16:48:39.633091
0.15	0	0	0	0	0	1768	1768	active+degraded	2013-03-20 17:14:04.005586	17'13	36'80	[0]	[0]	0'0	0.00	0'0	0.00
1.14	2	0	2	0	512	2308	2308	active+degraded	2013-03-20 17:13:18.967086	41'18	36'89	[0]	[0]	0'0	0.00	0'0	0.00
0.14	0	0	0	0	0	2448	2448	active+remapped	2013-03-20 17:15:10.953657	17'18	39'83	[1]	[1,0]	11'9	2013-03-20 16:43:37.556698	11'9	2013-03-20 16:43:37.556698
1.13	1	0	0	0	29	129	129	active+remapped	2013-03-20 17:14:25.350437	3'1	39'53	[1]	[1,0]	3'1	2013-03-20 16:45:55.590867	3'1	2013-03-20 16:45:55.590867
2.13	0	0	0	0	0	0	0	active+degraded	2013-03-20 17:13:18.968930	0'0	36'66	[0]	[0]	0'0	0.00	0'0	0.00
2.12	0	0	0	0	0	0	0	active+remapped	2013-03-20 17:14:25.396528	0'0	39'75	[1]	[1,0]	0'0	2013-03-20 16:48:35.632422	0'0	2013-03-20 16:48:35.632422
2.c	0	0	0	0	0	0	0	active+remapped	2013-03-20 17:14:25.400472	0'0	39'47	[1]	[1,0]	0'0	2013-03-20 16:51:13.713841	0'0	2013-03-20 16:51:13.713841
0.e	0	0	0	0	0	1360	1360	active+remapped	2013-03-20 17:15:10.953677	17'10	39'60	[1]	[1,0]	11'5	2013-03-20 16:45:03.617117	11'5	2013-03-20 16:45:03.617117
1.d	0	0	0	0	0	0	0	active+remapped	2013-03-20 17:

Re: [ceph-users] Replacement hardware

2013-03-20 Thread Igor Laskovy
Actually, I have already recovered the OSDs and the MON daemon back into the cluster
according to http://ceph.com/docs/master/rados/operations/add-or-rm-osds/ and
http://ceph.com/docs/master/rados/operations/add-or-rm-mons/ .

But the doc is missing info about removing/adding an MDS.
How can I recover the MDS daemon for the failed node?


On Wed, Mar 20, 2013 at 3:23 PM, Dave (Bob)  wrote:

> Igor,
>
> I am sure that I'm right in saying that you just have to create a new
> filesystem (btrfs?) on the new block device, mount it, and then
> initialise the osd with:
>
> ceph-osd -i  --mkfs
>
> Then you can start the osd with:
>
> ceph-osd -i 
>
> Since you are replacing an osd that already existed, the cluster knows
> about it, and there is a key for it that is known.
>
> I don't claim any great expertise, but this is what I've been doing, and
> the cluster seems to adopt the new osd and sort everything out.
>
> David
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Igor Laskovy
facebook.com/igor.laskovy
Kiev, Ukraine
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Replacement hardware

2013-03-20 Thread Greg Farnum
The MDS doesn't have any local state. You just need to start up the daemon 
somewhere with a name and key that are known to the cluster (these can be 
different from or the same as the ones that existed on the dead node; it doesn't 
matter!). 
-Greg

Software Engineer #42 @ http://inktank.com | http://ceph.com


On Wednesday, March 20, 2013 at 10:40 AM, Igor Laskovy wrote:

> Actually, I already have recovered OSDs and MON daemon back to the cluster 
> according to http://ceph.com/docs/master/rados/operations/add-or-rm-osds/ and 
> http://ceph.com/docs/master/rados/operations/add-or-rm-mons/ . 
> 
> But doc has missed info about removing/add MDS.
> How I can recovery MDS daemon for failed node?
> 
> 
> 
> On Wed, Mar 20, 2013 at 3:23 PM, Dave (Bob)  (mailto:d...@bob-the-boat.me.uk)> wrote:
> > Igor,
> > 
> > I am sure that I'm right in saying that you just have to create a new
> > filesystem (btrfs?) on the new block device, mount it, and then
> > initialise the osd with:
> > 
> > ceph-osd -i  --mkfs
> > 
> > Then you can start the osd with:
> > 
> > ceph-osd -i 
> > 
> > Since you are replacing an osd that already existed, the cluster knows
> > about it, and there is a key for it that is known.
> > 
> > I don't claim any great expertise, but this is what I've been doing, and
> > the cluster seems to adopt the new osd and sort everything out.
> > 
> > David
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com (mailto:ceph-users@lists.ceph.com)
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> 
> 
> -- 
> Igor Laskovy
> facebook.com/igor.laskovy (http://facebook.com/igor.laskovy)
> Kiev, Ukraine 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com (mailto:ceph-users@lists.ceph.com)
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Replacement hardware

2013-03-20 Thread Igor Laskovy
Well, can you please clarify exactly which key I must use? Do I need to
get/generate it somehow from the working cluster?


On Wed, Mar 20, 2013 at 7:41 PM, Greg Farnum  wrote:

> The MDS doesn't have any local state. You just need start up the daemon
> somewhere with a name and key that are known to the cluster (these can be
> different from or the same as the one that existed on the dead node;
> doesn't matter!).
> -Greg
>
> Software Engineer #42 @ http://inktank.com | http://ceph.com
>
>
> On Wednesday, March 20, 2013 at 10:40 AM, Igor Laskovy wrote:
>
> > Actually, I already have recovered OSDs and MON daemon back to the
> cluster according to
> http://ceph.com/docs/master/rados/operations/add-or-rm-osds/ and
> http://ceph.com/docs/master/rados/operations/add-or-rm-mons/ .
> >
> > But doc has missed info about removing/add MDS.
> > How I can recovery MDS daemon for failed node?
> >
> >
> >
> > On Wed, Mar 20, 2013 at 3:23 PM, Dave (Bob)  d...@bob-the-boat.me.uk)> wrote:
> > > Igor,
> > >
> > > I am sure that I'm right in saying that you just have to create a new
> > > filesystem (btrfs?) on the new block device, mount it, and then
> > > initialise the osd with:
> > >
> > > ceph-osd -i  --mkfs
> > >
> > > Then you can start the osd with:
> > >
> > > ceph-osd -i 
> > >
> > > Since you are replacing an osd that already existed, the cluster knows
> > > about it, and there is a key for it that is known.
> > >
> > > I don't claim any great expertise, but this is what I've been doing,
> and
> > > the cluster seems to adopt the new osd and sort everything out.
> > >
> > > David
> > > ___
> > > ceph-users mailing list
> > > ceph-users@lists.ceph.com (mailto:ceph-users@lists.ceph.com)
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> >
> >
> >
> > --
> > Igor Laskovy
> > facebook.com/igor.laskovy (http://facebook.com/igor.laskovy)
> > Kiev, Ukraine
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com (mailto:ceph-users@lists.ceph.com)
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
>


-- 
Igor Laskovy
facebook.com/igor.laskovy
Kiev, Ukraine
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Replacement hardware

2013-03-20 Thread Greg Farnum
Yeah. If you run "ceph auth list" you'll get a dump of all the users and keys 
the cluster knows about; each of your daemons has that key stored somewhere 
locally (generally in /var/lib/ceph/ceph-[osd|mds|mon].$id). You can create 
more or copy an unused MDS one. I believe the docs include information on how 
this works. 
-Greg

Software Engineer #42 @ http://inktank.com | http://ceph.com
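
For illustration, one hedged way to mint a key for a replacement MDS and start it (the exact capability syntax and paths may differ between releases; 'a' is just an illustrative daemon name):

  ceph auth get-or-create mds.a mon 'allow rwx' osd 'allow *' mds 'allow' \
      -o /var/lib/ceph/mds/ceph-a/keyring
  ceph-mds -i a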


On Wednesday, March 20, 2013 at 10:48 AM, Igor Laskovy wrote:

> Well, can you please clarify what exactly key I must to use? Do I need to 
> get/generate it somehow from working cluster?
> 
> 
> On Wed, Mar 20, 2013 at 7:41 PM, Greg Farnum  (mailto:g...@inktank.com)> wrote:
> > The MDS doesn't have any local state. You just need start up the daemon 
> > somewhere with a name and key that are known to the cluster (these can be 
> > different from or the same as the one that existed on the dead node; 
> > doesn't matter!).
> > -Greg
> > 
> > Software Engineer #42 @ http://inktank.com | http://ceph.com
> > 
> > 
> > On Wednesday, March 20, 2013 at 10:40 AM, Igor Laskovy wrote:
> > 
> > > Actually, I already have recovered OSDs and MON daemon back to the 
> > > cluster according to 
> > > http://ceph.com/docs/master/rados/operations/add-or-rm-osds/ and 
> > > http://ceph.com/docs/master/rados/operations/add-or-rm-mons/ .
> > > 
> > > But doc has missed info about removing/add MDS.
> > > How I can recovery MDS daemon for failed node?
> > > 
> > > 
> > > 
> > > On Wed, Mar 20, 2013 at 3:23 PM, Dave (Bob)  > > (mailto:d...@bob-the-boat.me.uk) (mailto:d...@bob-the-boat.me.uk)> wrote:
> > > > Igor,
> > > > 
> > > > I am sure that I'm right in saying that you just have to create a new
> > > > filesystem (btrfs?) on the new block device, mount it, and then
> > > > initialise the osd with:
> > > > 
> > > > ceph-osd -i  --mkfs
> > > > 
> > > > Then you can start the osd with:
> > > > 
> > > > ceph-osd -i 
> > > > 
> > > > Since you are replacing an osd that already existed, the cluster knows
> > > > about it, and there is a key for it that is known.
> > > > 
> > > > I don't claim any great expertise, but this is what I've been doing, and
> > > > the cluster seems to adopt the new osd and sort everything out.
> > > > 
> > > > David
> > > > ___
> > > > ceph-users mailing list
> > > > ceph-users@lists.ceph.com (mailto:ceph-users@lists.ceph.com) 
> > > > (mailto:ceph-users@lists.ceph.com)
> > > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > > 
> > > 
> > > 
> > > 
> > > 
> > > --
> > > Igor Laskovy
> > > facebook.com/igor.laskovy (http://facebook.com/igor.laskovy) 
> > > (http://facebook.com/igor.laskovy)
> > > Kiev, Ukraine
> > > ___
> > > ceph-users mailing list
> > > ceph-users@lists.ceph.com (mailto:ceph-users@lists.ceph.com) 
> > > (mailto:ceph-users@lists.ceph.com)
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > 
> 
> 
> 
> 
> -- 
> Igor Laskovy
> facebook.com/igor.laskovy (http://facebook.com/igor.laskovy)
> Kiev, Ukraine 



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Replacement hardware

2013-03-20 Thread Igor Laskovy
Oh, thank you!


On Wed, Mar 20, 2013 at 7:52 PM, Greg Farnum  wrote:

> Yeah. If you run "ceph auth list" you'll get a dump of all the users and
> keys the cluster knows about; each of your daemons has that key stored
> somewhere locally (generally in /var/lib/ceph/ceph-[osd|mds|mon].$id). You
> can create more or copy an unused MDS one. I believe the docs include
> information on how this works.
> -Greg
>
> Software Engineer #42 @ http://inktank.com | http://ceph.com
>
>
> On Wednesday, March 20, 2013 at 10:48 AM, Igor Laskovy wrote:
>
> > Well, can you please clarify what exactly key I must to use? Do I need
> to get/generate it somehow from working cluster?
> >
> >
> > On Wed, Mar 20, 2013 at 7:41 PM, Greg Farnum  g...@inktank.com)> wrote:
> > > The MDS doesn't have any local state. You just need start up the
> daemon somewhere with a name and key that are known to the cluster (these
> can be different from or the same as the one that existed on the dead node;
> doesn't matter!).
> > > -Greg
> > >
> > > Software Engineer #42 @ http://inktank.com | http://ceph.com
> > >
> > >
> > > On Wednesday, March 20, 2013 at 10:40 AM, Igor Laskovy wrote:
> > >
> > > > Actually, I already have recovered OSDs and MON daemon back to the
> cluster according to
> http://ceph.com/docs/master/rados/operations/add-or-rm-osds/ and
> http://ceph.com/docs/master/rados/operations/add-or-rm-mons/ .
> > > >
> > > > But doc has missed info about removing/add MDS.
> > > > How I can recovery MDS daemon for failed node?
> > > >
> > > >
> > > >
> > > > On Wed, Mar 20, 2013 at 3:23 PM, Dave (Bob) 
> > > >  d...@bob-the-boat.me.uk) (mailto:d...@bob-the-boat.me.uk)> wrote:
> > > > > Igor,
> > > > >
> > > > > I am sure that I'm right in saying that you just have to create a
> new
> > > > > filesystem (btrfs?) on the new block device, mount it, and then
> > > > > initialise the osd with:
> > > > >
> > > > > ceph-osd -i  --mkfs
> > > > >
> > > > > Then you can start the osd with:
> > > > >
> > > > > ceph-osd -i 
> > > > >
> > > > > Since you are replacing an osd that already existed, the cluster
> knows
> > > > > about it, and there is a key for it that is known.
> > > > >
> > > > > I don't claim any great expertise, but this is what I've been
> doing, and
> > > > > the cluster seems to adopt the new osd and sort everything out.
> > > > >
> > > > > David
> > > > > ___
> > > > > ceph-users mailing list
> > > > > ceph-users@lists.ceph.com (mailto:ceph-users@lists.ceph.com)
> (mailto:ceph-users@lists.ceph.com)
> > > > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Igor Laskovy
> > > > facebook.com/igor.laskovy (http://facebook.com/igor.laskovy) (
> http://facebook.com/igor.laskovy)
> > > > Kiev, Ukraine
> > > > ___
> > > > ceph-users mailing list
> > > > ceph-users@lists.ceph.com (mailto:ceph-users@lists.ceph.com)
> (mailto:ceph-users@lists.ceph.com)
> > > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > >
> >
> >
> >
> >
> > --
> > Igor Laskovy
> > facebook.com/igor.laskovy (http://facebook.com/igor.laskovy)
> > Kiev, Ukraine
>
>
>
>


-- 
Igor Laskovy
facebook.com/igor.laskovy
Kiev, Ukraine
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Bad drive caused radosgw to timeout with http 500s

2013-03-20 Thread Jeppesen, Nelson
Hello Ceph-Users,

I was testing our rados gateway, and after a few hours rgw started sending http 
500 responses for certain uploads. I did some digging and found that an HDD had 
died. The OSD was marked out, but not before a short rgw outage. Start to finish 
it was 60 to 120 seconds.

I have a few questions;

1) Fastcgi timed out after 30 seconds. If I raise the timeout to 120 seconds, 
will that protect me from future HDD failures? 
Example of the error.log from apache:

[error] [client 10.194.255.14] FastCGI: incomplete headers (0 bytes) 
received from server "/var/www/s3gw.fcgi"
[error] [client 10.194.255.1] FastCGI: comm with server 
"/var/www/s3gw.fcgi" aborted: idle timeout (30 sec)

2) Why did it take so long for Ceph to recover? 

3) Anything I can do to improve HDD failure resiliency?

Thank you. 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Crush map example

2013-03-20 Thread Darryl Bond

I have a cluster of 3 hosts, each with 2 SSDs and 4 spinning disks.
I used the example in the crush map doco to create a crush map to place
the primary on an SSD and the replica on a spinning disk.

If I use the example, I end up with objects replicated on the same host,
if I use 2 replicas.

Question 1, is the documentation on the rules correct, should they
really be both ruleset 4 and why? I used ruleset 5 for the ssd-primary.
  rule ssd {
  ruleset 4
  type replicated
  min_size 0
  max_size 10
  step take ssd
  step chooseleaf firstn 0 type host
  step emit
  }

  rule ssd-primary {
  ruleset 4
  type replicated
  min_size 0
  max_size 10
  step take ssd
  step chooseleaf firstn 1 type host
  step emit
  step take platter
  step chooseleaf firstn -1 type host
  step emit
  }

Question 2, Is there any way to ensure that the replicas are on
different hosts when we use double rooted trees for the 2 technologies?
Obviously, the simplest way is to have them on separate hosts.

For the moment, I have increased the number of replicas in the pool to 3,
which does at least ensure that copies are spread across multiple
hosts.

Darryl


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Crush map example

2013-03-20 Thread Gregory Farnum
On Wed, Mar 20, 2013 at 5:06 PM, Darryl Bond  wrote:
> I have a cluster of 3 hosts each with 2 SSD and 4 Spinning disks.
> I used the example in th ecrush map doco to create a crush map to place
> the primary on the SSD and replica on spinning disk.
>
> If I use the example, I end up with objects replicated on the same host,
> if I use 2 replicas.
>
> Question 1, is the documentation on the rules correct, should they
> really be both ruleset 4 and why? I used ruleset 5 for the ssd-primary.
>   rule ssd {
>   ruleset 4
>   type replicated
>   min_size 0
>   max_size 10
>   step take ssd
>   step chooseleaf firstn 0 type host
>   step emit
>   }
>
>   rule ssd-primary {
>   ruleset 4
>   type replicated
>   min_size 0
>   max_size 10
>   step take ssd
>   step chooseleaf firstn 1 type host
>   step emit
>   step take platter
>   step chooseleaf firstn -1 type host
>   step emit
>   }

Hmm, no, those should be two different rulesets. You use the same
ruleset if you want to specify different placements depending on how
many replicas you're using on a particular pool (so that you could for
instance use the same ruleset for all your pools, but have higher
replication counts imply 2 SSD copies instead of just 1 or something).
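
Concretely, a sketch of the second rule from Darryl's mail with its own ruleset id (only the ruleset number differs from what was posted):

  rule ssd-primary {
          ruleset 5
          type replicated
          min_size 0
          max_size 10
          step take ssd
          step chooseleaf firstn 1 type host
          step emit
          step take platter
          step chooseleaf firstn -1 type host
          step emit
  }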


> Question 2, Is there any way to ensure that the replicas are on
> different hosts when we use double rooted trees for the 2 technologies?
> Obviously, the simplest way is to have them on separate hosts.

Sadly, CRUSH doesn't support this kind of thing right now; if you want
to do it properly you should have different kinds of storage
segregated by host.
Extensions to CRUSH to enable this kind of behavior are on our list of
starter projects for interns and external contributors, and we push it
from time to time, so this could be coming in the future — just don't
count on it by any particular date. :)
-Greg
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com