[ceph-users] waiting for 1 open ops to drain
I am using ceph 0.58, kernel 3.9-rc2 and btrfs on my osds. I have an osd that starts up but blocks with the log message 'waiting for 1 open ops to drain'. The drain never completes, so I can't get the osd 'up', and I need to clear this problem.

I recently had an osd go problematic and recreated a fresh btrfs filesystem on the problem osd drive. I have also added a completely new osd. The 'waiting for 1 open ops to drain' problem occurred before the cluster had recovered from the earlier surgery, and I need to get the data from this osd. I have increased the number of copies from 2 to 3 to give me more resilience in the future, but that has not taken effect yet. Once I get the cluster back to health, I will mkfs.btrfs and rebuild this osd and one other that is a legacy from earlier kernel/ceph versions.

How can I tell the osd not to bother with waiting for its open ops to drain?

Thank you in anticipation.

David

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
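For what it's worth, the OSD's admin socket can usually show which request is still open; a minimal diagnostic sketch, assuming the default socket path and that this build exposes the command:

    # lists the op(s) the daemon is still waiting on, with their age and current state
    ceph --admin-daemon /var/run/ceph/ceph-osd.<id>.asok dump_ops_in_flight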
Re: [ceph-users] create volume from an image
Thanks Josh, the problem is solved by updating ceph on the glance node.

Sent from my iPhone

On 2013-3-20, at 14:59, "Josh Durgin" wrote:

> On 03/19/2013 11:03 PM, Chen, Xiaoxi wrote:
>> I think Josh may be the right man for this question ☺
>>
>> To be more precise, I would like to add more words about the status:
>>
>> 1. We have configured "show_image_direct_url = True" in Glance, and from the
>> Cinder-volume's log, we can make sure we have got a direct_url, for example:
>> image_id 6565d775-553b-41b6-9d5e-ddb825677706
>> image_location rbd://6565d775-553b-41b6-9d5e-ddb825677706
>> 2. In the _is_cloneable function, it tries to "_parse_location" the
>> direct_url (rbd://6565d775-553b-41b6-9d5e-ddb825677706) into 4 parts:
>> fsid, pool, volume, snapshot. Since the direct_url passed from Glance doesn't
>> provide fsid, pool and snapshot info, the parse fails and _is_cloneable
>> returns false, which finally drops the request down to
>> RBDDriver::copy_image_to_volume.
>
> This is working as expected - cloning was introduced in format 2 rbd
> volumes, available in bobtail but not argonaut. When the image is
> uploaded to glance, it is created as format 2 and a snapshot of it is
> taken and protected from deletion if the installed version of librbd
> (via the python-ceph package) supports it.
>
> The location reported will be just the image id for format 1 images.
> For format 2 images, it has 4 parts, as you noted. You may need to
> update python-ceph and librbd1 on the node running glance-api
> and re-upload the image so it will be created as format 2, rather
> than the current image which is format 1, and thus cannot be cloned.
>
>> 3. In cinder/volume/driver.py, RBDDriver::copy_image_to_volume, we have seen this note:
>> # TODO(jdurgin): replace with librbd this is a temporary hack, since
>> rewriting this driver to use librbd would take too long
>> And in this function, the cinder RBD driver downloads the whole
>> image from Glance into a temp file in the local filesystem, then uses rbd import
>> to import the temp file into an RBD volume.
>
> That note is about needing to remove the volume before importing the
> data, instead of just writing to it directly with librbd.
>
>> This is absolutely not what we want (zero copy and CoW), so we are
>> digging into the _is_cloneable function.
>>
>> It seems the straightforward way to solve 2) is to write a patch for glance
>> that adds more info to the direct_url, but I am not sure if it's possible for
>> ceph to clone an RBD from pool A to pool B?
>
> Cloning from one pool to another is certainly supported. If you're
> interested in more details about cloning, check out the command line
> usage [1] and internal design [2].
>
> Josh
>
> [1] https://github.com/ceph/ceph/blob/master/doc/rbd/rbd-snapshot.rst#layering
> [2] https://github.com/ceph/ceph/blob/master/doc/dev/rbd-layering.rst
>
>> From: ceph-users-boun...@lists.ceph.com
>> [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Li, Chen
>> Sent: 20 March 2013 12:57
>> To: 'ceph-users@lists.ceph.com'
>> Subject: [ceph-users] create volume from an image
>>
>> I'm using Ceph RBD for both Cinder and Glance. Cinder and Glance are
>> installed on two machines.
>> I have gathered from many places that when cinder and glance are both
>> using Ceph RBD, no real data transfer should happen, because of copy on write.
>> But the truth is, when I run the command:
>> cinder create --image-id 6565d775-553b-41b6-9d5e-ddb825677706 --display-name test 3
>> I can still see network data traffic between cinder and glance.
>> And I checked the cinder code: the image_location is None
>> (cinder/volume/manager.py), which makes cinder fail when running cloned =
>> self.driver.clone_image(volume_ref, image_location).
>> Is this an OpenStack (cinder or glance) bug?
>> Or have I missed some configuration?
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
> ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
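For anyone tracing the same behaviour, the image format is easy to verify after re-uploading; a minimal sketch, assuming the image sits in a pool called 'images' and that glance-api.conf carries the usual rbd settings (the values below are examples, not the poster's actual config):

    # check the format of the glance image object
    rbd info images/6565d775-553b-41b6-9d5e-ddb825677706
    # the output should report "format: 2"; format 1 images cannot be cloned

    # glance-api.conf (example values)
    # default_store = rbd
    # rbd_store_pool = images
    # show_image_direct_url = True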
[ceph-users] Replacement hardware
Hi there! What steps need to be performed if we have totally lost a node? As I understand from the docs, OSDs must be recreated (disabled, removed and created again, right?). But what about MON and MDS?

-- Igor Laskovy facebook.com/igor.laskovy Kiev, Ukraine

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Calculate and increase pg_num
Dan, Sébastien, thanks for the hints. For Inktank: there's no doubt that the appropriate pg_num is a very important parameter to set, and given the lack of a stable command to increase it, the workaround by Sébastien Han, along with the advice from Dan van der Ster, should be included in the documentation.

-- Marco Aroldi

2013/3/15 Dan van der Ster :
> On Fri, Mar 15, 2013 at 4:44 PM, Marco Aroldi wrote:
>> Dan,
>> this sounds weird:
>> how can you run "cephfs /mnt/mycephfs set_layout 10" on an unmounted
>> mountpoint?
>
> We had cephfs still mounted from earlier (before the copy pool, delete
> pool). Basically, any file reads resulted in an I/O error, but
> nevertheless the set_layout worked. We didn't try mounting cephfs
> again when the root pool didn't exist.
>
> Cheers, Dan

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
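For reference, the rule of thumb most often quoted for choosing pg_num (a rough guide only, not an official recommendation; the numbers below are made-up examples):

    # rough sizing: total PGs ~= (number of OSDs * 100) / replica count,
    # rounded up to the next power of two, then split across pools by expected data share
    osds=24; size=3
    echo $(( osds * 100 / size ))   # 800 -> round up to 1024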
[ceph-users] Replacement hardware
Igor,

I am sure that I'm right in saying that you just have to create a new filesystem (btrfs?) on the new block device, mount it, and then initialise the osd with:

ceph-osd -i <id> --mkfs

Then you can start the osd with:

ceph-osd -i <id>

Since you are replacing an osd that already existed, the cluster knows about it, and there is a key for it that is known.

I don't claim any great expertise, but this is what I've been doing, and the cluster seems to adopt the new osd and sort everything out.

David

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
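Spelled out end to end, the sequence looks roughly like this (a sketch only; the device, osd id and mount point are placeholders, and the keyring step is only needed if the old key was lost with the disk):

    # example: rebuilding osd.3 on a replacement disk /dev/sdX
    mkfs.btrfs /dev/sdX
    mount /dev/sdX /var/lib/ceph/osd/ceph-3
    ceph-osd -i 3 --mkfs               # re-initialise the osd data directory
    # if the key was lost too, regenerate and re-register it:
    #   ceph-osd -i 3 --mkfs --mkkey
    #   ceph auth add osd.3 osd 'allow *' mon 'allow rwx' -i /var/lib/ceph/osd/ceph-3/keyring
    ceph-osd -i 3                      # start the daemon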
[ceph-users] pgs stuck unclean
Hello! I've deployed a test ceph cluster according to this guide: http://ceph.com/docs/master/start/quick-start/
The problem is that the cluster will never go to a clean state by itself. The corresponding outputs are the following:

root@test-4:~# ceph health
HEALTH_WARN 3 pgs degraded; 38 pgs stuck unclean; recovery 2/44 degraded (4.545%)

root@test-4:~# ceph -s
   health HEALTH_WARN 3 pgs degraded; 38 pgs stuck unclean; recovery 2/44 degraded (4.545%)
   monmap e1: 1 mons at {a=10.0.0.3:6789/0}, election epoch 1, quorum 0 a
   osdmap e45: 2 osds: 2 up, 2 in
   pgmap v344: 384 pgs: 346 active+clean, 35 active+remapped, 3 active+degraded; 6387 KB data, 2025 MB used, 193 GB / 200 GB avail; 2/44 degraded (4.545%)
   mdsmap e29: 1/1/1 up {0=a=up:active}

root@test-4:~# ceph pg dump_stuck unclean
ok
pg_stat objects mip degr unf bytes log disklog state state_stamp v reported up acting last_scrub scrub_stamp last_deep_scrub deep_scrub_stamp
1.6b 0 0 0 0 0 0 0 active+remapped 2013-03-20 17:13:22.056155 0'0 36'45 [0] [0,1] 0'0 2013-03-20 16:50:19.699765 0'0 2013-03-20 16:50:19.699765
2.6a 0 0 0 0 0 0 0 active+remapped 2013-03-20 17:13:22.062933 0'0 36'45 [0] [0,1] 0'0 2013-03-20 16:53:22.749668 0'0 2013-03-20 16:53:22.749668
0.62 0 0 0 0 0 2584 2584 active+remapped 2013-03-20 17:15:10.953654 17'19 39'63 [1] [1,0] 11'12 2013-03-20 16:46:48.646752 11'12 2013-03-20 16:46:48.646752
2.60 0 0 0 0 0 0 0 active+remapped 2013-03-20 17:14:25.331682 0'0 39'47 [1] [1,0] 0'0 2013-03-20 16:53:04.744990 0'0 2013-03-20 16:53:04.744990
1.61 0 0 0 0 0 0 0 active+remapped 2013-03-20 17:14:25.345445 0'0 39'47 [1] [1,0] 0'0 2013-03-20 16:49:58.694300 0'0 2013-03-20 16:49:58.694300
2.45 0 0 0 0 0 0 0 active+remapped 2013-03-20 17:14:25.179279 0'0 39'75 [1] [1,0] 0'0 2013-03-20 16:49:43.649700 0'0 2013-03-20 16:49:43.649700
1.46 0 0 0 0 0 0 0 active+remapped 2013-03-20 17:14:25.179239 0'0 39'75 [1] [1,0] 0'0 2013-03-20 16:47:10.610772 0'0 2013-03-20 16:47:10.610772
0.47 0 0 0 0 0 3808 3808 active+remapped 2013-03-20 17:15:10.953601 17'28 39'93 [1] [1,0] 11'19 2013-03-20 16:44:31.572090 11'19 2013-03-20 16:44:31.572090
0.3c 0 0 0 0 0 3128 3128 active+remapped 2013-03-20 17:14:08.006824 17'23 36'53 [0] [0,1] 11'14 2013-03-20 16:46:13.639052 11'14 2013-03-20 16:46:13.639052
1.3b 1 0 0 0 2338546 4224 4224 active+remapped 2013-03-20 17:13:22.018020 41'33 36'87 [0] [0,1] 0'0 2013-03-20 16:49:01.678543 0'0 2013-03-20 16:49:01.678543
2.3a 0 0 0 0 0 0 0 active+remapped 2013-03-20 17:13:22.022849 0'0 36'45 [0] [0,1] 0'0 2013-03-20 16:52:06.728006 0'0 2013-03-20 16:52:06.728006
0.35 0 0 0 0 0 4216 4216 active+remapped 2013-03-20 17:14:08.006831 17'31 36'47 [0] [0,1] 11'23 2013-03-20 16:46:05.636185 11'23 2013-03-20 16:46:05.636185
1.34 0 0 0 0 0 0 0 active+remapped 2013-03-20 17:13:22.036661 0'0 36'45 [0] [0,1] 0'0 2013-03-20 16:48:46.674504 0'0 2013-03-20 16:48:46.674504
2.33 0 0 0 0 0 0 0 active+remapped 2013-03-20 17:13:22.048476 0'0 36'45 [0] [0,1] 0'0 2013-03-20 16:51:49.724215 0'0 2013-03-20 16:51:49.724215
0.21 0 0 0 0 0 1360 1360 active+remapped 2013-03-20 17:15:10.953645 17'10 39'20 [1] [1,0] 0'0 0.00 0'0 0.00
1.20 0 0 0 0 0 0 0 active+remapped 2013-03-20 17:14:25.290933 0'0 39'19 [1] [1,0] 0'0 0.00 0'0 0.00
2.1f 0 0 0 0 0 0 0 active+remapped 2013-03-20 17:14:25.309581 0'0 39'19 [1] [1,0] 0'0 0.00 0'0 0.00
0.1d 0 0 0 0 0 4080 4080 active+remapped 2013-03-20 17:14:08.006880 17'30 36'124 [0] [0,1] 11'20 2013-03-20 16:43:51.560375 11'20 2013-03-20 16:43:51.560375
1.1c 0 0 0 0 0 0 0 active+remapped 2013-03-20 17:13:22.131767 0'0 36'83 [0] [0,1] 0'0 2013-03-20 16:46:06.593051 0'0 2013-03-20 16:46:06.593051
2.1b 0 0 0 0 0 0 0 active+remapped 2013-03-20 17:13:22.148274 0'0 36'83 [0] [0,1] 0'0 2013-03-20 16:48:39.633091 0'0 2013-03-20 16:48:39.633091
0.15 0 0 0 0 0 1768 1768 active+degraded 2013-03-20 17:14:04.005586 17'13 36'80 [0] [0] 0'0 0.00 0'0 0.00
1.14 2 0 2 0 512 2308 2308 active+degraded 2013-03-20 17:13:18.967086 41'18 36'89 [0] [0] 0'0 0.00 0'0 0.00
0.14 0 0 0 0 0 2448 2448 active+remapped 2013-03-20 17:15:10.953657 17'18 39'83 [1] [1,0] 11'9 2013-03-20 16:43:37.556698 11'9 2013-03-20 16:43:37.556698
1.13 1 0 0 0 29 129 129 active+remapped 2013-03-20 17:14:25.350437 3'1 39'53 [1] [1,0] 3'1 2013-03-20 16:45:55.590867 3'1 2013-03-20 16:45:55.590867
2.13 0 0 0 0 0 0 0 active+degraded 2013-03-20 17:13:18.968930 0'0 36'66 [0] [0] 0'0 0.00 0'0 0.00
2.12 0 0 0 0 0 0 0 active+remapped 2013-03-20 17:14:25.396528 0'0 39'75 [1] [1,0] 0'0 2013-03-20 16:48:35.632422 0'0 2013-03-20 16:48:35.632422
2.c 0 0 0 0 0 0 0 active+remapped 2013-03-20 17:14:25.400472 0'0 39'47 [1] [1,0] 0'0 2013-03-20 16:51:13.713841 0'0 2013-03-20 16:51:13.713841
0.e 0 0 0 0 0 1360 1360 active+remapped 2013-03-20 17:15:10.953677 17'10 39'60 [1] [1,0] 11'5 2013-03-20 16:45:03.617117 11'5 2013-03-20 16:45:03.617117
1.d 0 0 0 0 0 0 0 active+remapped 2013-03-20 17:
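For anyone digging into the same state, the placement decisions behind an individual stuck pg can be inspected directly (a diagnostic sketch only; the pg id below is just one of the remapped pgs from the dump above):

    root@test-4:~# ceph osd tree       # shows where the 2 osds sit in the CRUSH hierarchy
    root@test-4:~# ceph pg 1.6b query  # peering/recovery detail behind the differing up and acting sets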
Re: [ceph-users] Replacement hardware
Actually, I have already recovered the OSDs and the MON daemon back into the cluster according to http://ceph.com/docs/master/rados/operations/add-or-rm-osds/ and http://ceph.com/docs/master/rados/operations/add-or-rm-mons/ .

But the doc is missing info about removing/adding an MDS. How can I recover the MDS daemon for the failed node?

On Wed, Mar 20, 2013 at 3:23 PM, Dave (Bob) wrote:
> Igor,
>
> I am sure that I'm right in saying that you just have to create a new
> filesystem (btrfs?) on the new block device, mount it, and then
> initialise the osd with:
>
> ceph-osd -i --mkfs
>
> Then you can start the osd with:
>
> ceph-osd -i
>
> Since you are replacing an osd that already existed, the cluster knows
> about it, and there is a key for it that is known.
>
> I don't claim any great expertise, but this is what I've been doing, and
> the cluster seems to adopt the new osd and sort everything out.
>
> David
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

--
Igor Laskovy
facebook.com/igor.laskovy
Kiev, Ukraine

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Replacement hardware
The MDS doesn't have any local state. You just need to start up the daemon somewhere with a name and key that are known to the cluster (these can be different from or the same as the one that existed on the dead node; doesn't matter!).
-Greg

Software Engineer #42 @ http://inktank.com | http://ceph.com

On Wednesday, March 20, 2013 at 10:40 AM, Igor Laskovy wrote:

> Actually, I already have recovered OSDs and MON daemon back to the cluster according to
> http://ceph.com/docs/master/rados/operations/add-or-rm-osds/ and
> http://ceph.com/docs/master/rados/operations/add-or-rm-mons/ .
>
> But doc has missed info about removing/add MDS.
> How I can recovery MDS daemon for failed node?
>
> On Wed, Mar 20, 2013 at 3:23 PM, Dave (Bob) wrote:
> > Igor,
> >
> > I am sure that I'm right in saying that you just have to create a new
> > filesystem (btrfs?) on the new block device, mount it, and then
> > initialise the osd with:
> >
> > ceph-osd -i --mkfs
> >
> > Then you can start the osd with:
> >
> > ceph-osd -i
> >
> > Since you are replacing an osd that already existed, the cluster knows
> > about it, and there is a key for it that is known.
> >
> > I don't claim any great expertise, but this is what I've been doing, and
> > the cluster seems to adopt the new osd and sort everything out.
> >
> > David
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> --
> Igor Laskovy
> facebook.com/igor.laskovy
> Kiev, Ukraine
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Replacement hardware
Well, can you please clarify which key exactly I must use? Do I need to get/generate it somehow from the working cluster?

On Wed, Mar 20, 2013 at 7:41 PM, Greg Farnum wrote:
> The MDS doesn't have any local state. You just need start up the daemon
> somewhere with a name and key that are known to the cluster (these can be
> different from or the same as the one that existed on the dead node;
> doesn't matter!).
> -Greg
>
> Software Engineer #42 @ http://inktank.com | http://ceph.com
>
> On Wednesday, March 20, 2013 at 10:40 AM, Igor Laskovy wrote:
>
> > Actually, I already have recovered OSDs and MON daemon back to the
> > cluster according to
> > http://ceph.com/docs/master/rados/operations/add-or-rm-osds/ and
> > http://ceph.com/docs/master/rados/operations/add-or-rm-mons/ .
> >
> > But doc has missed info about removing/add MDS.
> > How I can recovery MDS daemon for failed node?
> >
> > On Wed, Mar 20, 2013 at 3:23 PM, Dave (Bob) wrote:
> > > Igor,
> > >
> > > I am sure that I'm right in saying that you just have to create a new
> > > filesystem (btrfs?) on the new block device, mount it, and then
> > > initialise the osd with:
> > >
> > > ceph-osd -i --mkfs
> > >
> > > Then you can start the osd with:
> > >
> > > ceph-osd -i
> > >
> > > Since you are replacing an osd that already existed, the cluster knows
> > > about it, and there is a key for it that is known.
> > >
> > > I don't claim any great expertise, but this is what I've been doing, and
> > > the cluster seems to adopt the new osd and sort everything out.
> > >
> > > David
> > > ___
> > > ceph-users mailing list
> > > ceph-users@lists.ceph.com
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> > --
> > Igor Laskovy
> > facebook.com/igor.laskovy
> > Kiev, Ukraine
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>

--
Igor Laskovy
facebook.com/igor.laskovy
Kiev, Ukraine

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Replacement hardware
Yeah. If you run "ceph auth list" you'll get a dump of all the users and keys the cluster knows about; each of your daemons has that key stored somewhere locally (generally in /var/lib/ceph/ceph-[osd|mds|mon].$id). You can create more or copy an unused MDS one. I believe the docs include information on how this works. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Wednesday, March 20, 2013 at 10:48 AM, Igor Laskovy wrote: > Well, can you please clarify what exactly key I must to use? Do I need to > get/generate it somehow from working cluster? > > > On Wed, Mar 20, 2013 at 7:41 PM, Greg Farnum (mailto:g...@inktank.com)> wrote: > > The MDS doesn't have any local state. You just need start up the daemon > > somewhere with a name and key that are known to the cluster (these can be > > different from or the same as the one that existed on the dead node; > > doesn't matter!). > > -Greg > > > > Software Engineer #42 @ http://inktank.com | http://ceph.com > > > > > > On Wednesday, March 20, 2013 at 10:40 AM, Igor Laskovy wrote: > > > > > Actually, I already have recovered OSDs and MON daemon back to the > > > cluster according to > > > http://ceph.com/docs/master/rados/operations/add-or-rm-osds/ and > > > http://ceph.com/docs/master/rados/operations/add-or-rm-mons/ . > > > > > > But doc has missed info about removing/add MDS. > > > How I can recovery MDS daemon for failed node? > > > > > > > > > > > > On Wed, Mar 20, 2013 at 3:23 PM, Dave (Bob) > > (mailto:d...@bob-the-boat.me.uk) (mailto:d...@bob-the-boat.me.uk)> wrote: > > > > Igor, > > > > > > > > I am sure that I'm right in saying that you just have to create a new > > > > filesystem (btrfs?) on the new block device, mount it, and then > > > > initialise the osd with: > > > > > > > > ceph-osd -i --mkfs > > > > > > > > Then you can start the osd with: > > > > > > > > ceph-osd -i > > > > > > > > Since you are replacing an osd that already existed, the cluster knows > > > > about it, and there is a key for it that is known. > > > > > > > > I don't claim any great expertise, but this is what I've been doing, and > > > > the cluster seems to adopt the new osd and sort everything out. > > > > > > > > David > > > > ___ > > > > ceph-users mailing list > > > > ceph-users@lists.ceph.com (mailto:ceph-users@lists.ceph.com) > > > > (mailto:ceph-users@lists.ceph.com) > > > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > > > > > > > > > > > > > > > > > -- > > > Igor Laskovy > > > facebook.com/igor.laskovy (http://facebook.com/igor.laskovy) > > > (http://facebook.com/igor.laskovy) > > > Kiev, Ukraine > > > ___ > > > ceph-users mailing list > > > ceph-users@lists.ceph.com (mailto:ceph-users@lists.ceph.com) > > > (mailto:ceph-users@lists.ceph.com) > > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > > > > > > -- > Igor Laskovy > facebook.com/igor.laskovy (http://facebook.com/igor.laskovy) > Kiev, Ukraine ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
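To make that concrete, a sketch of bringing an mds up on the replacement node (the name 'a', the caps and the keyring path are just the conventional ones of this era; check them against your own 'ceph auth list' output and the docs):

    # fetch the existing key, or register a new one, into the local keyring
    ceph auth get-or-create mds.a mon 'allow rwx' osd 'allow *' mds 'allow' -o /var/lib/ceph/mds/ceph-a/keyring
    # then start the daemon
    ceph-mds -i a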
Re: [ceph-users] Replacement hardware
Oh, thank you! On Wed, Mar 20, 2013 at 7:52 PM, Greg Farnum wrote: > Yeah. If you run "ceph auth list" you'll get a dump of all the users and > keys the cluster knows about; each of your daemons has that key stored > somewhere locally (generally in /var/lib/ceph/ceph-[osd|mds|mon].$id). You > can create more or copy an unused MDS one. I believe the docs include > information on how this works. > -Greg > > Software Engineer #42 @ http://inktank.com | http://ceph.com > > > On Wednesday, March 20, 2013 at 10:48 AM, Igor Laskovy wrote: > > > Well, can you please clarify what exactly key I must to use? Do I need > to get/generate it somehow from working cluster? > > > > > > On Wed, Mar 20, 2013 at 7:41 PM, Greg Farnum g...@inktank.com)> wrote: > > > The MDS doesn't have any local state. You just need start up the > daemon somewhere with a name and key that are known to the cluster (these > can be different from or the same as the one that existed on the dead node; > doesn't matter!). > > > -Greg > > > > > > Software Engineer #42 @ http://inktank.com | http://ceph.com > > > > > > > > > On Wednesday, March 20, 2013 at 10:40 AM, Igor Laskovy wrote: > > > > > > > Actually, I already have recovered OSDs and MON daemon back to the > cluster according to > http://ceph.com/docs/master/rados/operations/add-or-rm-osds/ and > http://ceph.com/docs/master/rados/operations/add-or-rm-mons/ . > > > > > > > > But doc has missed info about removing/add MDS. > > > > How I can recovery MDS daemon for failed node? > > > > > > > > > > > > > > > > On Wed, Mar 20, 2013 at 3:23 PM, Dave (Bob) > > > > d...@bob-the-boat.me.uk) (mailto:d...@bob-the-boat.me.uk)> wrote: > > > > > Igor, > > > > > > > > > > I am sure that I'm right in saying that you just have to create a > new > > > > > filesystem (btrfs?) on the new block device, mount it, and then > > > > > initialise the osd with: > > > > > > > > > > ceph-osd -i --mkfs > > > > > > > > > > Then you can start the osd with: > > > > > > > > > > ceph-osd -i > > > > > > > > > > Since you are replacing an osd that already existed, the cluster > knows > > > > > about it, and there is a key for it that is known. > > > > > > > > > > I don't claim any great expertise, but this is what I've been > doing, and > > > > > the cluster seems to adopt the new osd and sort everything out. > > > > > > > > > > David > > > > > ___ > > > > > ceph-users mailing list > > > > > ceph-users@lists.ceph.com (mailto:ceph-users@lists.ceph.com) > (mailto:ceph-users@lists.ceph.com) > > > > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > Igor Laskovy > > > > facebook.com/igor.laskovy (http://facebook.com/igor.laskovy) ( > http://facebook.com/igor.laskovy) > > > > Kiev, Ukraine > > > > ___ > > > > ceph-users mailing list > > > > ceph-users@lists.ceph.com (mailto:ceph-users@lists.ceph.com) > (mailto:ceph-users@lists.ceph.com) > > > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > > > > > > > > > > > > -- > > Igor Laskovy > > facebook.com/igor.laskovy (http://facebook.com/igor.laskovy) > > Kiev, Ukraine > > > > -- Igor Laskovy facebook.com/igor.laskovy Kiev, Ukraine ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] Bad drive caused radosgw to timeout with http 500s
Hello Ceph-Users,

I was testing our rados gateway and after a few hours rgw started sending http 500 responses for certain uploads. I did some digging and found that an HDD had died. The OSD was marked out, but not before a short rgw outage; start to finish was 60 to 120 seconds. I have a few questions:

1) Fastcgi timed out after 30 seconds. If I raise the timeout to 120 seconds, will that protect me from future HDD failures? Example from the apache error.log:
[error] [client 10.194.255.14] FastCGI: incomplete headers (0 bytes) received from server "/var/www/s3gw.fcgi"
[error] [client 10.194.255.1] FastCGI: comm with server "/var/www/s3gw.fcgi" aborted: idle timeout (30 sec)

2) Why did it take so long for Ceph to recover?

3) Is there anything I can do to improve HDD failure resiliency?

Thank you.

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
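A few of the knobs usually looked at for this (a sketch only; directive and option names should be checked against your mod_fastcgi and Ceph versions, and the values shown are illustrative defaults, not recommendations):

    # apache vhost: allow the radosgw FastCGI connection to wait longer than 30s
    FastCgiExternalServer /var/www/s3gw.fcgi -socket /tmp/radosgw.sock -idle-timeout 120

    # ceph.conf: how quickly a dead osd is reported down and later marked out
    [osd]
        osd heartbeat grace = 20          ; missed-heartbeat window before peers report an osd down
    [mon]
        mon osd down out interval = 300   ; how long a down osd waits before being marked out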
[ceph-users] Crush map example
I have a cluster of 3 hosts, each with 2 SSD and 4 spinning disks. I used the example in the crush map doco to create a crush map that places the primary on the SSD and the replica on a spinning disk.

If I use the example, I end up with objects replicated on the same host if I use 2 replicas.

Question 1: is the documentation on the rules correct, should they really both be ruleset 4, and why? I used ruleset 5 for the ssd-primary.

rule ssd {
        ruleset 4
        type replicated
        min_size 0
        max_size 10
        step take ssd
        step chooseleaf firstn 0 type host
        step emit
}

rule ssd-primary {
        ruleset 4
        type replicated
        min_size 0
        max_size 10
        step take ssd
        step chooseleaf firstn 1 type host
        step emit
        step take platter
        step chooseleaf firstn -1 type host
        step emit
}

Question 2: is there any way to ensure that the replicas are on different hosts when we use double rooted trees for the 2 technologies? Obviously, the simplest way is to have them on separate hosts. For the moment, I have increased the number of replicas in the pool to 3, which does ensure that copies are at least spread across multiple hosts.

Darryl

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Crush map example
On Wed, Mar 20, 2013 at 5:06 PM, Darryl Bond wrote:
> I have a cluster of 3 hosts each with 2 SSD and 4 Spinning disks.
> I used the example in the crush map doco to create a crush map to place
> the primary on the SSD and replica on spinning disk.
>
> If I use the example, I end up with objects replicated on the same host,
> if I use 2 replicas.
>
> Question 1, is the documentation on the rules correct, should they
> really be both ruleset 4 and why? I used ruleset 5 for the ssd-primary.
> rule ssd {
>         ruleset 4
>         type replicated
>         min_size 0
>         max_size 10
>         step take ssd
>         step chooseleaf firstn 0 type host
>         step emit
> }
>
> rule ssd-primary {
>         ruleset 4
>         type replicated
>         min_size 0
>         max_size 10
>         step take ssd
>         step chooseleaf firstn 1 type host
>         step emit
>         step take platter
>         step chooseleaf firstn -1 type host
>         step emit
> }

Hmm, no, those should both be different rulesets. You use the same ruleset if you want to specify different placements depending on how many replicas you're using on a particular pool (so that you could for instance use the same ruleset for all your pools, but have higher replication counts imply 2 SSD copies instead of just 1 or something).

> Question 2, Is there any way to ensure that the replicas are on
> different hosts when we use double rooted trees for the 2 technologies?
> Obviously, the simplest way is to have them on separate hosts.

Sadly, CRUSH doesn't support this kind of thing right now; if you want to do it properly you should have different kinds of storage segregated by host. Extensions to CRUSH to enable this kind of behavior are on our list of starter projects for interns and external contributors, and we push it from time to time, so this could be coming in the future — just don't count on it by any particular date. :)
-Greg

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
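Put concretely, the ssd-primary rule simply gets its own ruleset id (matching what the poster already did), and each pool is then pointed at whichever ruleset it should use; a sketch, with the pool name as an example only:

    rule ssd-primary {
            ruleset 5
            type replicated
            min_size 0
            max_size 10
            step take ssd
            step chooseleaf firstn 1 type host
            step emit
            step take platter
            step chooseleaf firstn -1 type host
            step emit
    }

    # then, for example:
    # ceph osd pool set rbd crush_ruleset 5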