[ceph-users] Musings

2014-08-14 Thread Robert LeBlanc
m at the same time like 2 out of 3). I've read through the online manual, so now I'm looking for personal perspectives that you may have. Thanks, Robert LeBlanc

Re: [ceph-users] v0.84 released

2014-08-18 Thread Robert LeBlanc
This may be a better question for Federico. I've pulled the systemd stuff from git and I have it working, but only if I have the volumes listed in fstab. Is this the intended way that systemd will function for now or am I missing a step? I'm pretty new to systemd. Thanks, Robert LeBlan

Re: [ceph-users] v0.84 released

2014-08-19 Thread Robert LeBlanc
OK, I don't think the udev rules are on my machines. I built the cluster manually and not with ceph-deploy. I must have missed adding the rules in the manual or the Packages from Debian (Jessie) did not create them. Robert LeBlanc On Mon, Aug 18, 2014 at 5:49 PM, Sage Weil wrote: > On

Re: [ceph-users] v0.84 released

2014-08-19 Thread Robert LeBlanc
ht, a udev-trigger should mount and activate the OSD, and I won't have to manually run the init.d script? Thanks, Robert LeBlanc On Tue, Aug 19, 2014 at 9:21 AM, Sage Weil wrote: > On Tue, 19 Aug 2014, Robert LeBlanc wrote: > > OK, I don't think the udev rules are on my machi

Re: [ceph-users] Musings

2014-08-19 Thread Robert LeBlanc
is if the cluster (2+1) is HEALTHY, does the write return after 2 of the OSDs (itself and one replica) complete the write or only after all three have completed the write? We are planning to try to do some testing on this as well if a clear answer can't be found. Thank you, Robert LeBlan

Re: [ceph-users] Musings

2014-08-19 Thread Robert LeBlanc
Thanks, your responses have been helpful. On Tue, Aug 19, 2014 at 1:48 PM, Gregory Farnum wrote: > On Tue, Aug 19, 2014 at 11:18 AM, Robert LeBlanc > wrote: > > Greg, thanks for the reply, please see in-line. > > > > > > On Tue, Aug 19, 2014 at 11:

Re: [ceph-users] pool with cache pool and rbd export

2014-08-22 Thread Robert LeBlanc
hing there. Robert LeBlanc On Fri, Aug 22, 2014 at 12:41 PM, Andrei Mikhailovsky wrote: > Hello guys, > > I am planning to perform regular rbd pool off-site backup with rbd export > and export-diff. I've got a small ceph firefly cluster with an active > writeback cache pool made o

Re: [ceph-users] pool with cache pool and rbd export

2014-08-22 Thread Robert LeBlanc
ndrei > > > ----- Original Message ----- > From: "Robert LeBlanc" > To: "Andrei Mikhailovsky" > Cc: ceph-users@lists.ceph.com > Sent: Friday, 22 August, 2014 8:21:08 PM > Subject: Re: [ceph-users] pool with cache pool and rbd export > > > My understan

Re: [ceph-users] pool with cache pool and rbd export

2014-08-22 Thread Robert LeBlanc
I believe the scrubbing happens at the pool level; when the backend pool is scrubbed, it is independent of the cache pool. It would be nice to get some definitive answers from someone who knows a lot more. Robert LeBlanc On Fri, Aug 22, 2014 at 3:16 PM, Andrei Mikhailovsky wrote: > Does t

[ceph-users] Prioritize Heartbeat packets

2014-08-27 Thread Robert LeBlanc
s our reasoning sound in this regard? Thanks, Robert LeBlanc

Re: [ceph-users] Prioritize Heartbeat packets

2014-08-27 Thread Robert LeBlanc
On Wed, Aug 27, 2014 at 4:15 PM, Sage Weil wrote: > On Wed, 27 Aug 2014, Robert LeBlanc wrote: > > I'm looking for a way to prioritize the heartbeat traffic higher than the > > storage and replication traffic. I would like to keep the ceph.conf as > > simple as

Re: [ceph-users] Prioritize Heartbeat packets

2014-08-27 Thread Robert LeBlanc
Interesting concept. What if this was extended to an external message bus system like RabbitMQ, ZeroMQ, etc? Robert LeBlanc Sent from a mobile device please excuse any typos. On Aug 27, 2014 7:34 PM, "Matt W. Benjamin" wrote: > Hi, > > I wasn't thinking of an interface

Re: [ceph-users] Uneven OSD usage

2014-08-28 Thread Robert LeBlanc
How many PGs do you have in your pool? This should be about 100/OSD. If it is too low, you could get an imbalance. I don't know the consequence of changing it on such a full cluster. The default values are only good for small test environments. Robert LeBlanc Sent from a mobile device p
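
For reference, the rule of thumb behind that 100/OSD figure is a simple calculation; a sketch assuming a hypothetical 36-OSD cluster with 3x replication:

    total PGs = (OSDs x 100) / replicas
              = (36 x 100) / 3
              = 1200  -> round up to a power of two: 2048

The pool would then be created (or grown) with that value, e.g. ceph osd pool set <pool> pg_num 2048.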

Re: [ceph-users] Questions regarding Crush Map

2014-09-02 Thread Robert LeBlanc
According to http://ceph.com/docs/master/rados/operations/crush-map/, you should be able to construct a clever use of 'step take' and 'step choose' rules in your CRUSH map to force one copy to a particular bucket and allow the other two copies to be chosen elsewhere. I was looking for a way to have
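
A minimal sketch of that construction, assuming hypothetical bucket names: 'rack1' is the bucket that must receive one copy, and 'default' is the root for the rest (with firstn, a negative count means "replica count minus that many"):

    rule pin_one_copy {
        ruleset 1
        type replicated
        min_size 3
        max_size 3
        step take rack1
        step chooseleaf firstn 1 type host
        step emit
        step take default
        step chooseleaf firstn -1 type host
        step emit
    }

One caveat with this sketch: the second pass draws from the whole 'default' tree, so unless rack1 is excluded from it, a copy can land in rack1 twice.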

Re: [ceph-users] SSD journal deployment experiences

2014-09-04 Thread Robert LeBlanc
ill be the best option, but it can still use some performance tweaking with small reads before it will be really viable for us. Robert LeBlanc On Thu, Sep 4, 2014 at 10:21 AM, Dan Van Der Ster wrote: > Dear Cephalopods, > > In a few weeks we will receive a batch of 200GB Intel DC S3700’s

Re: [ceph-users] SSD journal deployment experiences

2014-09-04 Thread Robert LeBlanc
;t want to make any big changes until we have a better idea of what the future looks like. I think the Enterprise versions of Ceph (n-1 or n-2) will be a bit too old from where we want to be, which I'm sure will work wonderfully on Red Hat, but how will n.1, n.2 or n.3 run? Robert LeBlanc On T

Re: [ceph-users] SSD journal deployment experiences

2014-09-04 Thread Robert LeBlanc
yet. Do you know if you can use an md RAID1 as a cache > dev? And is the graceful failover from wb to writethrough actually working > without data loss? > > Also, write behind sure would help the filestore, since I'm pretty sure > the same 4k blocks are being overwritten many t

Re: [ceph-users] SSD journal deployment experiences

2014-09-04 Thread Robert LeBlanc
gh. Are the patches you talk about just backports from later kernels or something different? Robert LeBlanc On Thu, Sep 4, 2014 at 1:13 PM, Stefan Priebe wrote: > Hi Dan, hi Robert, > > Am 04.09.2014 21:09, schrieb Dan van der Ster: > > Thanks again for all of your input.

Re: [ceph-users] Bcache / Enhanceio with osds

2014-09-22 Thread Robert LeBlanc
We are still in the middle of testing things, but so far we have had more improvement with SSD journals than the OSD cached with bcache (five OSDs fronted by one SSD). We still have yet to test if adding a bcache layer in addition to the SSD journals provides any additional improvements. Robert

Re: [ceph-users] cold-storage tuning Ceph

2015-02-23 Thread Robert LeBlanc
Sorry this is delayed, catching up. I believe this was talked about in the last Ceph summit. I think this was the blueprint: https://wiki.ceph.com/Planning/Blueprints/Hammer/Towards_Ceph_Cold_Storage On Wed, Jan 14, 2015 at 9:35 AM, Martin Millnert wrote: > Hello list, > > I'm currently trying to

Re: [ceph-users] OSD Startup Best Practice: gpt/udev or SysVInit/systemd ?

2015-02-24 Thread Robert LeBlanc
We have had good luck with letting udev do its thing on CentOS7. On Wed, Feb 18, 2015 at 7:46 PM, Anthony Alba wrote: > Hi Cephers, > > What is your "best practice" for starting up OSDs? > > I am trying to determine the most robust technique on CentOS 7 where I > have too much choice: > > udev/g

Re: [ceph-users] Centos 7 OSD silently fail to start

2015-02-25 Thread Robert LeBlanc
We use ceph-disk without any issues on CentOS7. If you want to do a manual deployment, verify you aren't missing any steps in http://ceph.com/docs/master/install/manual-deployment/#long-form. On Tue, Feb 24, 2015 at 5:46 PM, Barclay Jameson wrote: > I have tried to install ceph using ceph-deploy

Re: [ceph-users] who is using radosgw with civetweb?

2015-02-25 Thread Robert LeBlanc
would like to try it to offer some feedback on your question. Thanks, Robert LeBlanc On Wed, Feb 25, 2015 at 12:31 PM, Sage Weil wrote: > Hey, > > We are considering switching to civetweb (the embedded/standalone rgw web > server) as the primary supported RGW frontend instead of

Re: [ceph-users] Centos 7 OSD silently fail to start

2015-02-25 Thread Robert LeBlanc
I think that your problem lies with systemd (even though you are using SysV syntax, systemd is really doing the work). Systemd does not like multiple arguments and I think this is why it is failing. There is supposed to be some work done to get systemd working ok, but I think it has the limitation

Re: [ceph-users] who is using radosgw with civetweb?

2015-02-25 Thread Robert LeBlanc
Cool, I'll see if we have some cycles to look at it. On Wed, Feb 25, 2015 at 2:49 PM, Sage Weil wrote: > On Wed, 25 Feb 2015, Robert LeBlanc wrote: >> We tried to get radosgw working with Apache + mod_fastcgi, but due to >> the changes in radosgw, Apache, mode_*cgi, etc a

[ceph-users] Clarification of SSD journals for BTRFS rotational HDD

2015-02-25 Thread Robert LeBlanc
s. Thanks, Robert LeBlanc

Re: [ceph-users] Centos 7 OSD silently fail to start

2015-02-25 Thread Robert LeBlanc
teresting part is that "ceph-disk activate" apparently does it > correctly. Even after reboot, the services start as they should. > > On Wed, Feb 25, 2015 at 3:54 PM, Robert LeBlanc > wrote: >> >> I think that your problem lies with systemd (even though you are u

Re: [ceph-users] who is using radosgw with civetweb?

2015-02-26 Thread Robert LeBlanc
Thanks, we were able to get it up and running very quickly. If it performs well, I don't see any reason to use Apache+fast_cgi. I don't have any problems just focusing on civetweb. On Wed, Feb 25, 2015 at 2:49 PM, Sage Weil wrote: > On Wed, 25 Feb 2015, Robert LeBlanc wrote: >&

Re: [ceph-users] who is using radosgw with civetweb?

2015-02-26 Thread Robert LeBlanc
age- > From: ceph-devel-ow...@vger.kernel.org > [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Robert LeBlanc > Sent: Thursday, February 26, 2015 12:27 PM > To: Sage Weil > Cc: Ceph-User; ceph-devel > Subject: Re: [ceph-users] who is using radosgw with civetweb? > &

Re: [ceph-users] who is using radosgw with civetweb?

2015-02-26 Thread Robert LeBlanc
+1 for proxy. Keep the civetweb lean and mean and if people need "extras" let the proxy handle this. Proxies are easy to set-up and a simple example could be included in the documentation. On Thu, Feb 26, 2015 at 11:43 AM, Wido den Hollander wrote: > > >> Op 26 feb. 2015 om 18:22 heeft Sage Weil

Re: [ceph-users] old osds take much longer to start than newer osd

2015-02-27 Thread Robert LeBlanc
Does deleting/reformatting the old osds improve the performance? On Fri, Feb 27, 2015 at 6:02 AM, Corin Langosch wrote: > Hi guys, > > I'm using ceph for a long time now, since bobtail. I always upgraded every > few weeks/ months to the latest stable > release. Of course I also removed some osds

Re: [ceph-users] Clarification of SSD journals for BTRFS rotational HDD

2015-02-27 Thread Robert LeBlanc
Also sending to the devel list to see if they have some insight. On Wed, Feb 25, 2015 at 3:01 PM, Robert LeBlanc wrote: > I tried finding an answer to this on Google, but couldn't find it. > > Since BTRFS can parallel the journal with the write, does it make > sense to have t

Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-03 Thread Robert LeBlanc
I would be inclined to shut down both OSDs in a node, let the cluster recover. Once it is recovered, shut down the next two, let it recover. Repeat until all the OSDs are taken out of the cluster. Then I would set nobackfill and norecover. Then remove the hosts/disks from the CRUSH, then unset nobac
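
A sketch of that last step in commands, with osd.10 and osd.11 standing in for the OSDs being retired:

    ceph osd set nobackfill
    ceph osd set norecover
    ceph osd crush remove osd.10
    ceph osd crush remove osd.11
    ceph osd unset nobackfill
    ceph osd unset norecover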

Re: [ceph-users] Implement replication network with live cluster

2015-03-04 Thread Robert LeBlanc
If I remember right, someone has done this on a live cluster without any issues. I seem to remember that it had a fallback mechanism if the OSDs couldn't be reached on the cluster network to contact them on the public network. You could test it pretty easily without much impact. Take one OSD that h

Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-04 Thread Robert LeBlanc
> Thanks for the tip of course! > Andrija > > On 3 March 2015 at 18:34, Robert LeBlanc wrote: >> >> I would be inclined to shut down both OSDs in a node, let the cluster >> recover. Once it is recovered, shut down the next two, let it recover. >> Repeat until all t

Re: [ceph-users] Implement replication network with live cluster

2015-03-04 Thread Robert LeBlanc
>> that are stopped (and cluster resynced after that)? >> >> Thx again for the help >> >> On 4 March 2015 at 17:44, Robert LeBlanc wrote: >>> >>> If I remember right, someone has done this on a live cluster without >>> any issues. I seem to reme

Re: [ceph-users] Ceph User Teething Problems

2015-03-04 Thread Robert LeBlanc
I can't help much on the MDS front, but here is some answers and my view on some of it. On Wed, Mar 4, 2015 at 1:27 PM, Datatone Lists wrote: > I have been following ceph for a long time. I have yet to put it into > service, and I keep coming back as btrfs improves and ceph reaches > higher versi

Re: [ceph-users] Ceph User Teething Problems

2015-03-05 Thread Robert LeBlanc
David, You will need to up the limit of open files in the Linux system. Check /etc/security/limits.conf. It is explained somewhere in the docs, and the autostart scripts 'fix' the issue for most people. When I did a manual deploy for the same reasons you are, I ran into this too. Robe
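
A sketch of the kind of settings meant here, with illustrative values (the 'max open files' option in ceph.conf is what the init scripts use to raise the limit for the daemons):

    # /etc/security/limits.conf
    root    soft    nofile    131072
    root    hard    nofile    131072

    # or in ceph.conf, under [global]:
    max open files = 131072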

Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-05 Thread Robert LeBlanc
Hi Robert, >> >> I already have this stuff set. Ceph is 0.87.0 now... >> >> Thanks, will schedule this for weekend, 10G network and 36 OSDs - should >> move data in less than 8h per my last experience that was around 8h, but >> some 1G OSDs were included..

Re: [ceph-users] Prioritize Heartbeat packets

2015-03-06 Thread Robert LeBlanc
I see that Jian Wen has done work on this for 0.94. I tried looking through the code to see if I can figure out how to configure this new option, but it all went over my head pretty quick. Can I get a brief summary on how to set the priority of heartbeat packets or where to look in the code to fig

[ceph-users] Fwd: Prioritize Heartbeat packets

2015-03-06 Thread Robert LeBlanc
Hidden HTML ... trying again... -- Forwarded message -- From: Robert LeBlanc Date: Fri, Mar 6, 2015 at 5:20 PM Subject: Re: [ceph-users] Prioritize Heartbeat packets To: "ceph-users@lists.ceph.com" , ceph-devel I see that Jian Wen has done work on this for 0.94. I tri

Re: [ceph-users] Prioritize Heartbeat packets

2015-03-09 Thread Robert LeBlanc
e commit, this ought to do the trick: > > osd heartbeat use min delay socket = true > > On 07/03/15 01:20, Robert LeBlanc wrote: >> >> I see that Jian Wen has done work on this for 0.94. I tried looking >> through the code to see if I can figure out how to configure th
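
For reference, the option quoted above is a ceph.conf setting; per the commit discussed in this thread it marks heartbeat packets with DSCP CS6:

    [osd]
    osd heartbeat use min delay socket = true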

Re: [ceph-users] Prioritize Heartbeat packets

2015-03-09 Thread Robert LeBlanc
, 2015 at 3:36 AM, Robert LeBlanc wrote: >> I've found commit 9b9a682fe035c985e416ee1c112fa58f9045a27c and I see >> that when 'osd heartbeat use min delay socket = true' it will mark the >> packet with DSCP CS6. Based on the setting of the socket in >> msg/simple/Pi

Re: [ceph-users] Add monitor unsuccesful

2015-03-12 Thread Robert LeBlanc
ed? I don't remember if you said you checked it. Robert LeBlanc Sent from a mobile device please excuse any typos. On Mar 11, 2015 8:08 PM, "Jesus Chavez (jeschave)" wrote: > Thanks Steffen I have followed everything not sure what is going on, the > mon keyring and client adm

Re: [ceph-users] Add monitor unsuccesful

2015-03-12 Thread Robert LeBlanc
> Mobile: +51 1 5538883255 > CCIE - 44433 > On Mar 12, 2015, at 7:54 AM, Robert LeBlanc wrote: > If I remember right, the mon key has to be the same between all the mon > hosts. I don't think I added an admin k

Re: [ceph-users] Add monitor unsuccesful

2015-03-12 Thread Robert LeBlanc
5538883255 > CCIE - 44433 > On Mar 12, 2015, at 10:06 AM, Robert LeBlanc wrote: > Add the new monitor to the Monitor map.

Re: [ceph-users] Add monitor unsuccesful

2015-03-12 Thread Robert LeBlanc
tion Information. > > > On Mar 12, 2015, at 10:33 AM, Jesus Chavez (jeschave) > wrote: > Great :) so just 1 point more, step 4 in adding monitors (Add the > new monitor to the Monitor map.) this command actually runs in the new > monitor right? > > Thank you so much! > > Jesus Chavez > SYSTEMS ENGINEER-C.SALES > jesch...@cisco.com > Phone: +52 55 5267 3146 > Mobile: +51 1 5538883255 > CCIE - 44433 > On Mar 12, 2015, at 10:06 AM, Robert LeBlanc wrote: > Add the new monitor to the Monitor map.

Re: [ceph-users] osd replication

2015-03-12 Thread Robert LeBlanc
The primary OSD for an object is responsible for the replication. In a healthy cluster the workflow is as such: 1. Client looks up primary OSD in CRUSH map 2. Client sends object to be written to primary OSD 3. Primary OSD looks up replication OSD(s) in its CRUSH map 4. Primary OSD con

Re: [ceph-users] Add monitor unsuccesful

2015-03-12 Thread Robert LeBlanc
I'm not sure why you are having such a hard time. I added monitors (and removed them) on CentOS 7 by following what I had. The thing that kept tripping me up was firewalld. Once I either shut it off or created a service for Ceph, it worked fine. What is in /var/log/ceph/ceph-mon.tauro.log when

Re: [ceph-users] Add monitor unsuccesful

2015-03-12 Thread Robert LeBlanc
trying to figure out :(. Thank you so much! > > Jesus Chavez > SYSTEMS ENGINEER-C.SALES > jesch...@cisco.com > Phone: +52 55 5267 3146 > Mobile: +51 1 5538883255 > CCIE - 44433 > On Mar 12

Re: [ceph-users] Strange Monitor Appearance after Update

2015-03-12 Thread Robert LeBlanc
Two monitors don't work very well and really don't buy you anything. I would either add another monitor or remove one. Paxos is most effective with an odd number of monitors. I don't know about the problem you are experiencing and how to help you. An even number of monitors shoul

Re: [ceph-users] Strange Monitor Appearance after Update

2015-03-12 Thread Robert LeBlanc
Having two monitors should not be causing the problem you are seeing like you say. What is in /var/log/ceph/ceph.mon.*.log? Robert LeBlanc Sent from a mobile device please excuse any typos. On Mar 12, 2015 7:39 PM, "Georgios Dimitrakakis" wrote: > Hi Robert! > > Thanks for

Re: [ceph-users] OSD booting down

2015-03-12 Thread Robert LeBlanc
n run ceph-disk activate. Ceph-disk is just a script so you can open it up and take a look. So I guess it depends on which "automatically" you want to happen. Robert LeBlanc Sent from a mobile device please excuse any typos. On Mar 12, 2015 9:54 PM, "Jesus Chavez (jeschave)" wrote: >

Re: [ceph-users] Replication question

2015-03-13 Thread Robert LeBlanc
That is correct, you make a tradeoff between space, performance and resiliency. By reducing replication from 3 to 2, you will get more space and likely more performance (less overhead from third copy), but it comes at the expense of being able to recover your data when there are multiple failures.

Re: [ceph-users] Ceph + Infiniband CLUS & PUB Network

2015-03-17 Thread Robert LeBlanc
We have a test cluster with IB. We have both networks over IPoIB on the same IP subnet though (no cluster network configuration). On Tue, Mar 17, 2015 at 12:02 PM, German Anders wrote: > Hi All, > > Does anyone have Ceph implemented with Infiniband for Cluster and > Public network? > > Th

Re: [ceph-users] Ceph + Infiniband CLUS & PUB Network

2015-03-17 Thread Robert LeBlanc
> > Any help will really be appreciated. > > Thanks in advance, > > German Anders > Storage System Engineer Leader > Despegar | IT Team > office +54 11 4894 3500 x3408 > mobile +54 911 3493 7262 > mail gand...@despegar.com

Re: [ceph-users] Mapping OSD to physical device

2015-03-19 Thread Robert LeBlanc
Udev already provides some of this for you. Look in /dev/disk/by-*. You can reference drives by UUID, id or path (for SAS/SCSI/FC/iSCSI/etc) which will provide some consistency across reboots and hardware changes. On Thu, Mar 19, 2015 at 1:10 PM, Colin Corr wrote: > Greetings Cephers, > > I have
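
For example (standard udev symlink directories on any modern Linux):

    ls -l /dev/disk/by-uuid/   # filesystem UUID
    ls -l /dev/disk/by-id/     # serial number / WWN
    ls -l /dev/disk/by-path/   # SAS/SCSI/FC/iSCSI topology path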

Re: [ceph-users] Mapping OSD to physical device

2015-03-19 Thread Robert LeBlanc
cally finds the volume with udev, mounts it in the correct location and accesses the journal on the right disk. It also may be a limitation on the version of ceph-deploy/ceph-disk you are using. On Thu, Mar 19, 2015 at 5:54 PM, Colin Corr wrote: > On 03/19/2015 12:27 PM, Robert LeBlanc wrote:

Re: [ceph-users] Server Specific Pools

2015-03-20 Thread Robert LeBlanc
You can create CRUSH rulesets and then assign pools to different rulesets. http://ceph.com/docs/master/rados/operations/crush-map/#placing-different-pools-on-different-osds On Thu, Mar 19, 2015 at 7:28 PM, Garg, Pankaj wrote: > Hi, > > > > I have a Ceph cluster with both ARM and x86 based server
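
A sketch of the assignment, assuming a ruleset with id 1 is already defined in the CRUSH map and a hypothetical pool named 'arm-pool' (on releases of this era the pool setting is called crush_ruleset):

    ceph osd pool set arm-pool crush_ruleset 1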

Re: [ceph-users] Fwd: OSD Forece Removal

2015-03-20 Thread Robert LeBlanc
Removing the OSD from the CRUSH map and deleting the auth key is how you force remove an OSD. The OSD can no longer participate in the cluster, even if it does come back to life. All clients forget about the OSD when the new CRUSH map is distributed. On Fri, Mar 20, 2015 at 11:19 AM, Jesus Chavez
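
A sketch of that sequence, with osd.12 as a hypothetical example:

    ceph osd crush remove osd.12   # drop it from the CRUSH map
    ceph auth del osd.12           # delete its auth key
    ceph osd rm osd.12             # remove it from the OSD map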

Re: [ceph-users] OSD + Flashcache + udev + Partition uuid

2015-03-20 Thread Robert LeBlanc
We tested bcache and abandoned it for two reasons. 1. Didn't give us any better performance than journals on SSD. 2. We had lots of corruption of the OSDs and were rebuilding them frequently. Since removing them, the OSDs have been much more stable. On Fri, Mar 20, 2015 at 4:03 AM, Nick

Re: [ceph-users] PGs issue

2015-03-20 Thread Robert LeBlanc
The weight can be based on anything, size, speed, capability, some random value, etc. The important thing is that it makes sense to you and that you are consistent. Ceph by default (ceph-disk and I believe ceph-deploy) take the approach of using size. So if you use a different weighting scheme, yo

Re: [ceph-users] PGs issue

2015-03-20 Thread Robert LeBlanc
> This seems to be a fairly consistent problem for new users. > > The create-or-move is adjusting the crush weight, not the osd weight. > Perhaps the init script should set the default weight to 0.01 if it's <= 0? > > It seems like there's a downside to this, but I don't

Re: [ceph-users] Fwd: OSD Forece Removal

2015-03-20 Thread Robert LeBlanc
> jesch...@cisco.com > Phone: +52 55 5267 3146 > Mobile: +51 1 5538883255 > CCIE - 44433 > On Mar 20, 2015, at 2:21 PM, Robert LeBlanc wrote: > Removing the OSD from the CRUSH map and deleting the

Re: [ceph-users] OSD Forece Removal

2015-03-20 Thread Robert LeBlanc
Yes, at this point, I'd export the CRUSH, edit it and import it back in. What version are you running? Robert LeBlanc Sent from a mobile device please excuse any typos. On Mar 20, 2015 4:28 PM, "Jesus Chavez (jeschave)" wrote: > thats what you sayd? > > [root@capri

Re: [ceph-users] Replacing a failed OSD disk drive (or replace XFS with BTRFS)

2015-03-21 Thread Robert LeBlanc
a on it we were able to format 40 OSDs in under 30 minutes (we formatted a whole host at a time because we knew that was safe) with a few little online scripts. Short answer is don't be afraid to do it this way. Robert LeBlanc Sent from a mobile device please excuse any typos. On Mar 2

Re: [ceph-users] Multiple OSD's in a Each node with replica 2

2015-03-23 Thread Robert LeBlanc
I don't have a fresh cluster on hand to double check, but the default is to select a different host for each replica. You can adjust that to fit your needs, we are using cabinet as the selection criteria so that we can lose an entire cabinet of storage and still function. In order to store multipl

[ceph-users] CRUSH decompile failes

2015-03-23 Thread Robert LeBlanc
I was trying to decompile and edit the CRUSH map to adjust the CRUSH rules. My first attempt created a map that would decompile, but I could not recompile the CRUSH even if I didn't modify it. When trying to download the CRUSH fresh, now the decompile fails. [root@nodezz ~]# ceph osd getmap -o map.c

Re: [ceph-users] CRUSH decompile failes

2015-03-23 Thread Robert LeBlanc
For some reason it doesn't like the rack definition. I can move things around, like putting root before it and it always chokes on the first rack definition no matter which one it is. On Mon, Mar 23, 2015 at 12:53 PM, Robert LeBlanc wrote: > I was trying to decompile and edit the C

Re: [ceph-users] CRUSH decompile failes

2015-03-23 Thread Robert LeBlanc
which we are on). Saving for posterity's sake. Thanks Sage! On Mon, Mar 23, 2015 at 1:09 PM, Robert LeBlanc wrote: > Ok, so the decompile error is because I didn't download the CRUSH map > (found that out using hexdump), but I still can't compile an > unmodified CRUSH
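
For posterity, the round trip that works; the decompile failure above came from grabbing the OSD map (ceph osd getmap) instead of the CRUSH map:

    ceph osd getcrushmap -o map.crush   # binary CRUSH map, not the OSD map
    crushtool -d map.crush -o map.txt   # decompile
    crushtool -c map.txt -o map.new     # recompile after editing
    ceph osd setcrushmap -i map.new     # inject back into the cluster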

Re: [ceph-users] CRUSH Map Adjustment for Node Replication

2015-03-23 Thread Robert LeBlanc
You just need to change your rule from step chooseleaf firstn 0 type osd to step chooseleaf firstn 0 type host There will be data movement as it will want to move about half the objects to the new host. There will be data generation as you move from size 1 to size 2. As far as I know a deep scr
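
In context, the edited rule would read roughly as follows (root name 'default' assumed):

    rule replicated_ruleset {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host   # was: type osd
        step emit
    }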

Re: [ceph-users] CRUSH Map Adjustment for Node Replication

2015-03-23 Thread Robert LeBlanc
itrakakis Georgios wrote: > Robert thanks for the info! > > How can I find out and modify when the next deep scrub is scheduled, > the number of backfill processes and their priority? > > Best regards, > > George > > > > Robert LeBlanc wrote

[ceph-users] Does crushtool --test --simulate do what cluster should do?

2015-03-23 Thread Robert LeBlanc
I'm trying to create a CRUSH ruleset and I'm using crushtool to test the rules, but it doesn't seem to be mapping things correctly. I have two roots, one for spindles and another for SSD. I have two rules, one for each root. The output of crushtool on rule 0 shows objects being mapped to SSD OSDs when
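
For reference, the kind of invocation in question (rule id and replica count are illustrative):

    crushtool -i map.crush --test --rule 0 --num-rep 3 --show-mappings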

Re: [ceph-users] Write IO Problem

2015-03-24 Thread Robert LeBlanc
although we haven't had show-stopping issues with BTRFS, we are still going to start on XFS. Our plan is to build a cluster as a target for our backup system and we will put BTRFS on that to prove it in a production setting. Robert LeBlanc Sent from a mobile device please excuse any typos. On M

Re: [ceph-users] error creating image in rbd-erasure-pool

2015-03-24 Thread Robert LeBlanc
Is there an enumerated list of issues with snapshots on cache pools? We currently have snapshots on a cache tier and haven't seen any issues (development cluster). I just want to know what we should be looking for. On Tue, Mar 24, 2015 at 9:21 AM, Stéphane DUGRAVOT wrote: >

Re: [ceph-users] Does crushtool --test --simulate do what cluster should do?

2015-03-24 Thread Robert LeBlanc
Mar 23, 2015 at 6:08 PM, Robert LeBlanc wrote: > I'm trying to create a CRUSH ruleset and I'm using crushtool to test > the rules, but it doesn't seem to be mapping things correctly. I have two > roots, one for spindles and another for SSD. I have two rules, one for > ea

Re: [ceph-users] Does crushtool --test --simulate do what cluster should do?

2015-03-24 Thread Robert LeBlanc
http://tracker.ceph.com/issues/11224 On Tue, Mar 24, 2015 at 12:11 PM, Gregory Farnum wrote: > On Tue, Mar 24, 2015 at 10:48 AM, Robert LeBlanc wrote: >> I'm not sure why crushtool --test --simulate doesn't match what the >> cluster actually does, but the cluster seems

Re: [ceph-users] ERROR: missing keyring, cannot use cephx for authentication

2015-03-25 Thread Robert LeBlanc
It doesn't look like your OSD is mounted. What do you have when you run mount? How did you create your OSDs? Robert LeBlanc Sent from a mobile device please excuse any typos. On Mar 25, 2015 1:31 AM, "oyym...@gmail.com" wrote: > Hi,Jesus > I encountered similar problem. &g

Re: [ceph-users] New deployment: errors starting OSDs: "invalid (someone else's?) journal"

2015-03-25 Thread Robert LeBlanc
I don't know much about ceph-deploy, but I know that ceph-disk has problems "automatically" adding an SSD OSD when there are journals of other disks already on it. I've had to partition the disk ahead of time and pass in the partitions to make ceph-disk work. Also, unless you are sure that the de

Re: [ceph-users] New deployment: errors starting OSDs: "invalid (someone else's?) journal"

2015-03-25 Thread Robert LeBlanc
wrote: > On Wed, Mar 25, 2015 at 6:06 PM, Robert LeBlanc wrote: >> I don't know much about ceph-deploy, but I know that ceph-disk has >> problems "automatically" adding an SSD OSD when there are journals of >> other disks already on it. I've had to partition

Re: [ceph-users] Migrating objects from one pool to another?

2015-03-26 Thread Robert LeBlanc
down, create a new snapshot on the new pool, point the VM to that and then flatten the RBD. Robert LeBlanc Sent from a mobile device please excuse any typos. On Mar 26, 2015 5:23 PM, "Steffen W Sørensen" wrote: > > On 26/03/2015, at 23.13, Gregory Farnum wrote: > > The procedure
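
A sketch of that snapshot/clone/flatten sequence with hypothetical pool and image names (clone requires a protected snapshot and a format 2 image):

    rbd snap create oldpool/vmdisk@migrate
    rbd snap protect oldpool/vmdisk@migrate
    rbd clone oldpool/vmdisk@migrate newpool/vmdisk
    # repoint the VM at newpool/vmdisk, then:
    rbd flatten newpool/vmdisk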

[ceph-users] Where is the systemd files?

2015-03-26 Thread Robert LeBlanc
86_64 libcephfs1-0.93-0.el7.centos.x86_64 ceph-0.93-0.el7.centos.x86_64 ceph-deploy-1.5.22-0.noarch [ulhglive-root@mon1 systemd]# for i in $(rpm -qa | grep ceph); do rpm -ql $i | grep -i --color=always systemd; done [nothing returned] Thanks, Robert Le

[ceph-users] 0.93 fresh cluster won't create PGs

2015-03-27 Thread Robert LeBlanc
ount of time causing the OSDs to overrun a journal or something (I know that Ceph journals pgmap changes and such). I'm concerned that this could be very detrimental in a production environment. There doesn't seem to be a way to recover from this. Any thoughts? Thanks, Robert LeBlanc

Re: [ceph-users] 0.93 fresh cluster won't create PGs

2015-03-27 Thread Robert LeBlanc
Thanks, we'll give the gitbuilder packages a shot and report back. Robert LeBlanc Sent from a mobile device please excuse any typos. On Mar 27, 2015 10:03 PM, "Sage Weil" wrote: > On Fri, 27 Mar 2015, Robert LeBlanc wrote: > > I've built Ceph clusters a few times n

[ceph-users] Force an OSD to try to peer

2015-03-30 Thread Robert LeBlanc
I've been working at this peering problem all day. I've done a lot of testing at the network layer and I just don't believe that we have a problem that would prevent OSDs from peering. When looking though osd_debug 20/20 logs, it just doesn't look like the OSDs are trying to peer. I don't know if i

[ceph-users] Fwd: Force an OSD to try to peer

2015-03-30 Thread Robert LeBlanc
Sorry HTML snuck in somewhere. -- Forwarded message -- From: Robert LeBlanc Date: Mon, Mar 30, 2015 at 8:15 PM Subject: Force an OSD to try to peer To: Ceph-User , ceph-devel I've been working at this peering problem all day. I've done a lot of testing at the network

Re: [ceph-users] Force an OSD to try to peer

2015-03-31 Thread Robert LeBlanc
Turns out jumbo frames was not set on all the switch ports. Once that was resolved the cluster quickly became healthy. On Mon, Mar 30, 2015 at 8:15 PM, Robert LeBlanc wrote: > I've been working at this peering problem all day. I've done a lot of > testing at the network layer
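
A quick check that would have caught this, assuming a 9000-byte MTU (8972 bytes of payload plus 28 bytes of IP/ICMP headers; -M do forbids fragmentation):

    ping -M do -s 8972 <peer-ip>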

Re: [ceph-users] Force an OSD to try to peer

2015-03-31 Thread Robert LeBlanc
I was desperate for anything after exhausting every other possibility I could think of. Maybe I should put a checklist in the Ceph docs of things to look for. Thanks, On Tue, Mar 31, 2015 at 11:36 AM, Sage Weil wrote: > On Tue, 31 Mar 2015, Robert LeBlanc wrote: >> Turns out jumbo f

Re: [ceph-users] Force an OSD to try to peer

2015-03-31 Thread Robert LeBlanc
ng ? > In our setup so far, we haven't enabled jumbo frames other than performance > reason (if at all). > > Thanks & Regards > Somnath > > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Robert LeBlanc > Se

[ceph-users] What are you doing to locate performance issues in a Ceph cluster?

2015-04-06 Thread Robert LeBlanc
re others doing to locate performance issues in their Ceph clusters? Thanks, Robert LeBlanc

Re: [ceph-users] How to dispatch monitors in a multi-site cluster (ie in 2 datacenters)

2015-04-13 Thread Robert LeBlanc
I really like this proposal. On Mon, Apr 13, 2015 at 2:33 AM, Joao Eduardo Luis wrote: > On 04/13/2015 02:25 AM, Christian Balzer wrote: >> On Sun, 12 Apr 2015 14:37:56 -0700 Gregory Farnum wrote: >> >>> On Sun, Apr 12, 2015 at 1:58 PM, Francois Lafont >>> wrote: Somnath Roy wrote: >>>

Re: [ceph-users] low power single disk nodes

2015-04-13 Thread Robert LeBlanc
We are getting ready to put the Quantas into production. We looked at the Supermico Atoms (we have 6 of them), the rails were crap (they exploded the first time you pull the server out, and they stick out of the back of the cabinet about 8 inches, these boxes are already very deep), we also ran out

Re: [ceph-users] Network redundancy pro and cons, best practice, suggestions?

2015-04-13 Thread Robert LeBlanc
For us, using two 40Gb ports with VLANs is redundancy enough. We are doing LACP over two different switches. On Mon, Apr 13, 2015 at 3:03 AM, Götz Reinicke - IT Koordinator wrote: > Dear ceph users, > > we are planing a ceph storage cluster from scratch. Might be up to 1 PB > within the next 3 ye

Re: [ceph-users] low power single disk nodes

2015-04-13 Thread Robert LeBlanc
Message- >> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of >> Robert LeBlanc >> Sent: 13 April 2015 17:27 >> To: Jerker Nyberg >> Cc: ceph-users@lists.ceph.com >> Subject: Re: [ceph-users] low power single disk nodes >> >> We ar

[ceph-users] norecover and nobackfill

2015-04-13 Thread Robert LeBlanc
I'm looking for documentation about what exactly each of these does, and I can't find it. Can someone point me in the right direction? The names seem too ambiguous to come to any conclusion about what exactly they do. Thanks, Robert

Re: [ceph-users] norecover and nobackfill

2015-04-13 Thread Robert LeBlanc
active+undersized+degraded+remapped+backfilling 1 active+recovery_wait+undersized+degraded+remapped client io 1864 kB/s rd, 8853 kB/s wr, 65 op/s Any help understanding these flags would be very helpful. Thanks, Robert On Mon, Apr 13, 2015 at 1:40 PM, Robert LeBlanc wrote: >

Re: [ceph-users] norecover and nobackfill

2015-04-14 Thread Robert LeBlanc
CRUSH map. That should work, I'll test it on my cluster. I'd still like to know the difference between norecover and nobackfill if anyone knows. On Mon, Apr 13, 2015 at 7:40 PM, Francois Lafont wrote: > Hi, > > Robert LeBlanc wrote: > > > What I'm trying to achiev

Re: [ceph-users] norecover and nobackfill

2015-04-14 Thread Robert LeBlanc
OK, I remember now: if I don't remove the OSD from the CRUSH, ceph-disk will get a new OSD ID and the old one will hang around as a zombie. This will change the host/rack/etc weights, causing a cluster-wide rebalance. On Tue, Apr 14, 2015 at 9:31 AM, Robert LeBlanc wrote: > HmmmI

Re: [ceph-users] Ceph repo - RSYNC?

2015-04-15 Thread Robert LeBlanc
http://eu.ceph.com/ has rsync and Hammer. On Wed, Apr 15, 2015 at 10:17 AM, Paul Mansfield < paul.mansfi...@alcatel-lucent.com> wrote: > > Sorry for starting a new thread, I've only just subscribed to the list > and the archive on the mail listserv is far from complete at the moment. > > on 8th M
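
A sketch of mirroring from there; the module name 'ceph' is an assumption, so list the available modules first:

    rsync eu.ceph.com::                                # list available modules
    rsync -avz eu.ceph.com::ceph/rpm-hammer/ ./mirror/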

Re: [ceph-users] replace dead SSD journal

2015-04-17 Thread Robert LeBlanc
Delete and re-add all six OSDs. On Fri, Apr 17, 2015 at 3:36 AM, Andrija Panic wrote: > Hi guys, > > I have 1 SSD that hosted 6 OSD's Journals, that is dead, so 6 OSD down, > ceph rebalanced etc. > > Now I have new SSD inside, and I will partition it etc - but would like to > know, how to procee
