Re: [ceph-users] ceph-deploy journal on separate partition - quick info needed

2015-04-17 Thread Robert LeBlanc
If the journal file on the osd is a symlink to the partition and the OSD process is running, then the journal was created properly. The OSD would not start if the journal was not created. On Fri, Apr 17, 2015 at 2:43 PM, Andrija Panic wrote: > Hi all, > > when I run: > > ceph-deploy osd create S
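
A quick way to verify this (a sketch; the OSD ID and path are illustrative):

    # On the OSD host: if the journal was created properly, this is a
    # symlink pointing at the journal partition, not a regular file.
    ls -l /var/lib/ceph/osd/ceph-0/journal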

Re: [ceph-users] OSDs failing on upgrade from Giant to Hammer

2015-04-19 Thread Robert LeBlanc
Did you upgrade from 0.92? If you did, did you flush the logs before upgrading? On Sun, Apr 19, 2015 at 1:02 PM, Scott Laird wrote: > I'm upgrading from Giant to Hammer (0.94.1), and I'm seeing a ton of OSDs > die (and stay dead) with this error in the logs: > > 2015-04-19 11:53:36.796847 7f61fa

Re: [ceph-users] CRUSH rule for 3 replicas across 2 hosts

2015-04-20 Thread Robert LeBlanc
We have a similar issue, but we wanted three copies across two racks. It turns out that we increased size to 4 and left min_size at 2. We didn't want to risk having fewer than two copies, and if we only had three copies, losing a rack would block I/O. Once we expand to a third rack, we will adjust our r
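
A minimal sketch of the pool settings described above (the pool name is illustrative):

    ceph osd pool set rbd size 4        # four copies, two per rack
    ceph osd pool set rbd min_size 2    # keep serving I/O with two copies left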

Re: [ceph-users] CRUSH rule for 3 replicas across 2 hosts

2015-04-20 Thread Robert LeBlanc
On Mon, Apr 20, 2015 at 2:34 PM, Colin Corr wrote: > > > On 04/20/2015 11:02 AM, Robert LeBlanc wrote: > > We have a similar issue, but we wanted three copies across two racks. > Turns out, that we increased size to 4 and left min_size at 2. We didn't > want to risk h

Re: [ceph-users] CRUSH rule for 3 replicas across 2 hosts

2015-04-20 Thread Robert LeBlanc
til the objects have been replicated elsewhere in the rack, but it would not be more than 2 copies. I hope I'm making sense and that my jabbering is useful. On Mon, Apr 20, 2015 at 4:08 PM, Colin Corr wrote: > > > On 04/20/2015 01:46 PM, Robert LeBlanc wrote: > > > &

Re: [ceph-users] CephFS development since Firefly

2015-04-20 Thread Robert LeBlanc
Thanks for all your hard work on CephFS. This progress is very exciting to hear about. I am constantly amazed at the amount of work that gets done in Ceph in so short an amount of time. On Mon, Apr 20, 2015 at 6:26 PM, Gregory Farnum wrote: > We’ve been hard at work on CephFS over the last year

Re: [ceph-users] CephFS concurrency question

2015-04-21 Thread Robert LeBlanc
I think you are using an old version of OpenStack. I seem to remember a discussion about a patch to remove the requirement of shared storage for live migration on Ceph RBD. Are you using librbd in OpenStack? Robert LeBlanc Sent from a mobile device please excuse any typos. On Apr 21, 2015 6:43

Re: [ceph-users] Still CRUSH problems with 0.94.1 ?

2015-04-21 Thread Robert LeBlanc
We had a very similar problem, but it was repeatable on Firefly as well. For us, it turned out the MTUs on the switches were not all configured for 9000 byte frames. This prevented the peering process from completing, and as data was added it got worse. Here is a section I wrote for our internal docu

Re: [ceph-users] CRUSH rule for 3 replicas across 2 hosts

2015-04-21 Thread Robert LeBlanc
y written for). On Tue, Apr 21, 2015 at 8:36 AM, Colin Corr wrote: > > > On 04/20/2015 04:18 PM, Robert LeBlanc wrote: > > You usually won't end up with more than the "size" number of replicas, > even in a failure situation. Although technically more than "siz

Re: [ceph-users] CEPH 1 pgs incomplete

2015-04-22 Thread Robert LeBlanc
Look through the output of 'ceph pg 0.37 query' and see if it gives you any hints on where to look. On Wed, Apr 22, 2015 at 2:57 AM, MEGATEL / Rafał Gawron < rafal.gaw...@megatel.com.pl> wrote: > Hi > > I have problem with 1 pg incomplete > > ceph -w > > cluster 9cb96455-4d70-48e7-8517-8fd94

Re: [ceph-users] ceph-disk activate hangs with external journal device

2015-04-22 Thread Robert LeBlanc
I believe your problem is that you haven't created the bootstrap-osd key and distributed it to your OSD node in /var/lib/ceph/bootstrap-osd/. On Wed, Apr 22, 2015 at 5:41 AM, Daniel Piddock wrote: > Hi, > > I'm a ceph newbie setting up some trial installs for evaluation. > > Using Debian stable (Wh

Re: [ceph-users] Still CRUSH problems with 0.94.1 ? (explained)

2015-04-22 Thread Robert LeBlanc
The Ceph process should be listening on the IP address and port, not the physical NIC. I haven't encountered a problem like this, but it is good to know it may exist. You may want to tweak miimon, downdelay and updelay (I have a feeling that these won't really help you much as it seems the link was

Re: [ceph-users] ceph-disk activate hangs with external journal device

2015-04-23 Thread Robert LeBlanc
re. I don't think the first startup should be trying to create file systems; that should have been done with prepare. Robert LeBlanc Sent from a mobile device please excuse any typos. On Apr 23, 2015 3:43 AM, "Daniel Piddock" wrote: > On 22/04/15 20:32, Robert LeBlanc wrote: >

Re: [ceph-users] Accidentally Remove OSDs

2015-04-23 Thread Robert LeBlanc
A full CRUSH dump would be helpful, as well as knowing which OSDs you took out. If you didn't take 17 out as well as 15, then you might be OK. If the OSDs still show up in your CRUSH map, then try to remove them from the CRUSH map with 'ceph osd crush rm osd.15'. If you took out both OSDs, you will ne
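
For reference, the usual sequence for removing a dead OSD entirely is roughly this (osd.15 as in the thread; run with care):

    ceph osd crush rm osd.15   # remove it from the CRUSH map
    ceph auth del osd.15       # remove its cephx key
    ceph osd rm osd.15         # remove it from the OSD map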

Re: [ceph-users] Another OSD Crush question.

2015-04-23 Thread Robert LeBlanc
If you force CRUSH to put copies in each rack, then you will be limited by the smallest rack. You can hit some severe limitations if you try to keep your copies to two racks (see the thread titled "CRUSH rule for 3 replicas across 2 hosts") for some of my explanation about this. If I were you, I w

Re: [ceph-users] Accidentally Remove OSDs

2015-04-23 Thread Robert LeBlanc
ng the tools, I'd be inclined to delete the pool and restore from backup so as not to be surprised by data corruption in the images. Neither option is ideal or quick. Robert LeBlanc Sent from a mobile device please excuse any typos. On Apr 23, 2015 6:42 PM, "FaHui Lin" wrote:

Re: [ceph-users] Serving multiple applications with a single cluster

2015-04-23 Thread Robert LeBlanc
You could map the RBD to each host and put a cluster file system like OCFS2 on it so all cluster nodes can read and write at the same time. If these are VMs, then you can present the RBD in libvirt and the root user would not have access to mount other RBD in the same pool. Robert LeBlanc Sent

Re: [ceph-users] Having trouble getting good performance

2015-04-24 Thread Robert LeBlanc
The write is ACKed to the client as soon as it is in the journal. I suspect that the primary OSD dispatches the write to all the secondary OSDs at the same time so that it happens in parallel, but I am not an authority on that. The journal writes data serially even if it comes in randomly. There is some

Re: [ceph-users] Ceph recovery network?

2015-04-26 Thread Robert LeBlanc
My understanding is that Monitors monitor the public address of the OSDs and other OSDs monitor the cluster address of the OSDs. Replication, recovery and backfill traffic all use the same network when you specify 'cluster network = ' in your ceph.conf. It is useful to remember that replication, re
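
The relevant ceph.conf stanza looks like this (subnets are illustrative):

    [global]
        public network  = 192.168.1.0/24   # clients and monitors
        cluster network = 10.10.10.0/24    # replication, recovery, backfill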

Re: [ceph-users] RBD storage pool support in Libvirt not enabled on CentOS

2015-04-29 Thread Robert LeBlanc
We have had to build our own QEMU. On Wed, Apr 29, 2015 at 4:34 AM, Wido den Hollander wrote: > Hi, > > While working with some CentOS machines I found out that Libvirt > currently is not build with RBD storage pool support. > > While that support has been upstream for a very long time and enabl

Re: [ceph-users] Change osd nearfull and full ratio of a running cluster

2015-04-29 Thread Robert LeBlanc
ceph tell mon.* injectargs "--mon_osd_full_ratio .97" ceph tell mon.* injectargs "--mon_osd_nearfull_ratio .92" On Wed, Apr 29, 2015 at 7:38 AM, Stefan Priebe - Profihost AG < s.pri...@profihost.ag> wrote: > Hi, > > how can i change the osd full and osd nearfull ratio of a running cluster? > > Ju
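
Note that injectargs only changes the running daemons; to persist the settings across restarts, something like this would go in ceph.conf (a sketch, using the same values):

    [mon]
        mon osd full ratio = .97
        mon osd nearfull ratio = .92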

Re: [ceph-users] Cost- and Powerefficient OSD-Nodes

2015-04-29 Thread Robert LeBlanc
The only way I know of to actually extend the reserved space is the method described here: https://www.thomas-krenn.com/en/wiki/SSD_Over-provisioning_using_hdparm On Wed, Apr 29, 2015 at 12:12 PM, Lionel Bouton wrote: > Hi Dominik, > > On 04/29/15 19:06, Dominik Hannen wrote: >> I had plann
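
The gist of that method, as a hedged sketch (device and sector count are illustrative, and the change persists; read the article before trying it):

    # Shrink the drive's reported capacity with a Host Protected Area,
    # leaving the hidden space as extra over-provisioning for the controller:
    hdparm -N p234441648 /dev/sdX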

Re: [ceph-users] Quick question - version query

2015-05-01 Thread Robert LeBlanc
ceph --admin-daemon <socket> version On Fri, May 1, 2015 at 10:44 AM, Tony Harris wrote: > Hi all, > > I feel a bit like an idiot at the moment - I know there is a command through > ceph to query the monitor and OSD daemons to check their version level, but > I can't remember what it is to save my life a
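
For example (default socket paths shown; adjust for your daemon IDs):

    ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok version
    ceph --admin-daemon /var/run/ceph/ceph-mon.node1.asok version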

Re: [ceph-users] Rename or Remove Pool

2015-05-05 Thread Robert LeBlanc
Can you try ceph osd pool rename " " new-name On Tue, May 5, 2015 at 12:43 PM, Georgios Dimitrakakis wrote: > > Hi all! > > Somehow I have a pool without a name... > > $ ceph osd lspools > 3 data,4 metadata,5 rbd,6 .rgw,7 .rgw.control,8 .rgw.gc,9 .log,10 > .intent-log,11 .usage,12 .users,13 .u
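
If the pool name really is empty, quoting an empty string should do it (a sketch; whether an empty string is accepted may depend on the release, and the new name is illustrative):

    ceph osd pool rename "" mypool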

Re: [ceph-users] ceph auth get-or-create not taking key from input file?

2015-05-06 Thread Robert LeBlanc
According to the help for get-or-create, it looks like it should take an input file. I've only ever used ceph auth import in this regard. I would file a bug report on get-or-create. On Wed, May 6, 2015 at 8:36 AM, Sergio A. de Carvalho Jr. < scarvalh...@gmail.com> wrote: > Hi, > > While creating
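
The import route that does work takes a keyring file (path illustrative):

    ceph auth import -i /etc/ceph/client.newuser.keyring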

Re: [ceph-users] OSD in ceph.conf

2015-05-06 Thread Robert LeBlanc
We don't have OSD entries in our Ceph config. They are not needed if you don't have specific configs for different OSDs. Robert LeBlanc Sent from a mobile device please excuse any typos. On May 6, 2015 7:18 PM, "Florent MONTHEL" wrote: > Hi team, > > Is it neces

Re: [ceph-users] OSD in ceph.conf

2015-05-07 Thread Robert LeBlanc
correct location. There is no need for fstab entries or the like. This allows you to easily move OSD disks between servers (if you take the journals with it). It's magic! But I think I just gave away the secret. Robert LeBlanc Sent from a mobile device please excuse any typos. On May 7, 2015

Re: [ceph-users] Find out the location of OSD Journal

2015-05-07 Thread Robert LeBlanc
You may also be able to use `ceph-disk list`. On Thu, May 7, 2015 at 3:56 AM, Francois Lafont wrote: > Hi, > > Patrik Plank wrote: > > > I can't remember on which drive I installed which OSD journal :-|| > > Is there any command to show this? > > It's probably not the answer you hope for, but why don't

Re: [ceph-users] How to backup hundreds or thousands of TB

2015-05-07 Thread Robert LeBlanc
On Thu, May 7, 2015 at 5:20 AM, Wido den Hollander wrote: > > Aren't snapshots something that should protect you against removal? IF > snapshots work properly in CephFS you could create a snapshot every hour. > > Unless the file is created and removed between snapshots, then the Recycle Bin featu

Re: [ceph-users] OSD in ceph.conf

2015-05-11 Thread Robert LeBlanc
manage so many disks and mount points, but it is much easier than I anticipated once I used ceph-disk. Robert LeBlanc Sent from a mobile device please excuse any typos. On May 11, 2015 5:32 AM, "Georgios Dimitrakakis" wrote: > Hi Robert, > > just to make sure I got it correctly:

Re: [ceph-users] Scrub Error / How does ceph pg repair work?

2015-05-11 Thread Robert LeBlanc
ould come up with an error. The idea is to keep rsyncing until the deep-scrub is clean. Be warned that you may be aiming your gun at your foot with this! -------- Robert LeBlanc GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Mon, May 11, 2015 at 2:09 AM,

Re: [ceph-users] kernel version for rbd client and hammer tunables

2015-05-12 Thread Robert LeBlanc
-END PGP SIGNATURE- -------- Robert LeBlanc GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Tue, May 12, 2015

Re: [ceph-users] Write freeze when writing to rbd image and rebooting one of the nodes

2015-05-13 Thread Robert LeBlanc
ngineers scratching their heads for years, never to be fixed. - Robert LeBlanc GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Wed, May 13, 2015 at 10:29 AM, Vasiliy Angapov wrote: Thanks, Sage! In the meanwhile I asked the same question in #Ceph IRC c

Re: [ceph-users] Write freeze when writing to rbd image and rebooting one of the nodes

2015-05-13 Thread Robert LeBlanc
-END PGP SIGNATURE- Robert LeBlanc GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Wed, May 13, 2015 at 2:32 PM, Vasiliy Angapov wrote: > Robert, thank y

Re: [ceph-users] Write freeze when writing to rbd image and rebooting one of the nodes

2015-05-14 Thread Robert LeBlanc
-END PGP SIGNATURE- Robert LeBlanc GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu, May 14, 2015

Re: [ceph-users] QEMU Venom Vulnerability

2015-05-19 Thread Robert LeBlanc
r a couple of packages. - -------- Robert LeBlanc GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Tue, May 19, 2015 at 12:33 PM, Georgios Dimitrakakis wrote: > I am trying to build the packages manually and I was wondering > is the flag --enable-rbd enough t

Re: [ceph-users] QEMU Venom Vulnerability

2015-05-19 Thread Robert LeBlanc
icial RPMs As to where to find the SRPMs, I'm not really sure; I come from a Debian background where access to source packages is really easy. - -------- Robert LeBlanc GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Tue, May 19, 2015 at 3:47 PM, Georgios Dimitraka

Re: [ceph-users] Three tier cache setup

2015-05-21 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Not supported at the moment, but it is in the eventual plans and I think some of the code has been written such that it will help facilitate the development. - Robert LeBlanc GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2

Re: [ceph-users] Replacing OSD disks with SSD journal - journal disk space use

2015-05-26 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 What version of Ceph are you using? I seem to remember an enhancement of ceph-disk for Hammer that is more aggressive in reusing previous partition. - Robert LeBlanc GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1

Re: [ceph-users] Blocked requests/ops?

2015-05-26 Thread Robert LeBlanc
. It seems like there was some discussion about this on the devel list recently. - -------- Robert LeBlanc GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Tue, May 26, 2015 at 4:06 AM, Xavier Serrano wrote: > Hello, > > Thanks for your detailed explanation,

Re: [ceph-users] Performance and CPU load on HP servers running ceph (DL380 G6, should apply to others too)

2015-05-26 Thread Robert LeBlanc
-END PGP SIGNATURE- Robert LeBlanc GPG Fingerprint 79A2 9CA4 6CC4 45DD

[ceph-users] Memory Allocators and Ceph

2015-05-27 Thread Robert LeBlanc
et.us/spreadsheets/d/1n12IqAOuH2wH-A7Sq5boU8kSEYg_Pl20sPmM0idjj00/edit?usp=sharing Test script is multitest. The real world test is based off of the disk stats of about 100 of our servers which have uptimes of many months. - - Robert LeBlanc GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E

Re: [ceph-users] Memory Allocators and Ceph

2015-05-27 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 The workload is, on average, 17KB per read request and 13KB per write request, with 73% read and 27% write. This is a web hosting workload. - Robert LeBlanc GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Wed

Re: [ceph-users] Memory Allocators and Ceph

2015-05-27 Thread Robert LeBlanc
data/benefits much faster. > > Still, excellent testing! We definitely need more of this so we can > determine if jemalloc is something that would be worth switching to > eventually. > > - Robert LeBlanc GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C7

Re: [ceph-users] Memory Allocators and Ceph

2015-05-28 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 I've got some more tests running right now. Once those are done, I'll find a couple of tests that had extreme difference and gather some perf data for them. - -------- Robert LeBlanc GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E

Re: [ceph-users] TCP or UDP

2015-05-28 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 TCP - Robert LeBlanc GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu, May 28, 2015 at 2:00 PM, Garg, Pankaj wrote: > Hi, > > Does ceph typically use TCP or UDP or something else for data

Re: [ceph-users] Read Errors and OSD Flapping

2015-06-01 Thread Robert LeBlanc
smart test. It will be a good test as to what to do in > this situation as I have a feeling this will most likely happen again. Please post back when you have a result, I'd like to know the outcome. - -------- Robert LeBlanc GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA

[ceph-users] Ceph asok filling nova open files

2015-06-03 Thread Robert LeBlanc
grep ceph | wc lsof: WARNING: can't stat() tmpfs file system /run Output information may be incomplete. 10109090 125240 [root@compute3 ~]# virsh list | wc 13 34 558 - -------- Robert LeBlanc GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1

Re: [ceph-users] Ceph asok filling nova open files

2015-06-03 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Thank you for pointing to the information. I'm glad a fix is already ready. I can't tell from https://github.com/ceph/ceph/pull/4657, will this be included in the next point release of hammer? Thanks, - -------- Robert L

Re: [ceph-users] External XFS Filesystem Journal on OSD

2015-06-04 Thread Robert LeBlanc
-END PGP SIGNATURE- Robert LeBlanc GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu, Jun 4, 2015 at 12:23 PM, Lars Marowsky-Bree

Re: [ceph-users] Recovering from multiple OSD failures

2015-06-05 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Did you try to deep-scrub the PG after copying it to 29? - Robert LeBlanc GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu, Jun 4, 2015 at 10:26 PM, Aaron Ten Clay wrote: > Hi Cephers, > > I rec

Re: [ceph-users] Recovering from multiple OSD failures

2015-06-05 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 When copying to the primary OSD, a deep-scrub has worked for me, but I've not done this exact scenario. Did you try bouncing the OSD process? - Robert LeBlanc GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1

Re: [ceph-users] rbd_cache, limiting read on high iops around 40k

2015-06-09 Thread Robert LeBlanc
-END PGP SIGNATURE- Robert LeBlanc GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Tue, Jun 9, 2015 at 6:02 AM, Alexandre DERUMIER wro

Re: [ceph-users] Beginners ceph journal question

2015-06-09 Thread Robert LeBlanc
-END PGP SIGNATURE- Robert LeBlanc GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Tue, Jun 9, 2015 at 8:35 AM, Vickey Singh wro

Re: [ceph-users] calculating maximum number of disk and node failure that can be handled by cluster with out data loss

2015-06-09 Thread Robert LeBlanc
(in the same host) or 1 host without data loss, but I/O will block until Ceph can replicate at least one more copy (assuming the min_size 2 stated above). - Robert LeBlanc GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Tue, Jun 9, 2015 at 9:53 AM, kevin

Re: [ceph-users] Ceph giant installation fails on rhel 7.0

2015-06-11 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Have you configured and enabled the EPEL repo? - Robert LeBlanc GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu, Jun 11, 2015 at 6:26 AM, Shambhu Rajak wrote: > I am trying to install ceph giant on r

Re: [ceph-users] Switching from tcmalloc

2015-06-24 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Did you see what the effect was of just restarting the OSDs, before switching from tcmalloc? I've noticed that there is usually a good drop for us just by restarting them. I don't think it is usually this drastic. - -------- Robert L

Re: [ceph-users] Unexpected issues with simulated 'rack' outage

2015-06-24 Thread Robert LeBlanc
D PGP SIGNATURE- -------- Robert LeBlanc GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Wed, Jun 24, 2015 at 7:44 AM, Lionel Bouton < lionel-subscript...@bouton.name> wrote: > On 06/24/15 14:44, Romero Junior wrote: > > Hi, > > > >

Re: [ceph-users] Switching from tcmalloc

2015-06-24 Thread Robert LeBlanc
ate memory) and they try much harder to reuse dirty free pages, so memory stays within the thread, again reducing locking for memory allocations. I would do some more testing along with what Ben Hines mentioned about overall client performance. - ---- Robert LeBlanc GPG Fingerprint 79A2 9

Re: [ceph-users] One of our nodes has logs saying: wrongly marked me down

2015-07-03 Thread Robert LeBlanc
You may not want to set your heartbeat grace so high, it will make I/O block for a long time in the case of a real failure. You may want to look at increasing down reporters instead. Robert LeBlanc Sent from a mobile device please excuse any typos. On Jul 2, 2015 9:39 PM, "Tuomas Jun
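
The two knobs in question, as a hedged ceph.conf sketch (values are illustrative; check the defaults for your release):

    [mon]
        # require reports from more OSDs before marking one down,
        # instead of stretching the heartbeat grace period:
        mon osd min down reporters = 6
    [osd]
        osd heartbeat grace = 20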

Re: [ceph-users] How to prefer faster disks in same pool

2015-07-09 Thread Robert LeBlanc
You could also create two roots and two rules and have the primary osd be the 10k drives so that the 7.2k are used primarily for writes. I believe that recipe is on the CRUSH page in the documentation. Robert LeBlanc Sent from a mobile device please excuse any typos. On Jul 9, 2015 10:03 PM
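
That recipe looks roughly like the ssd-primary example in the CRUSH documentation, adapted here for two spinning-disk roots (all names are illustrative):

    rule fast-primary {
        ruleset 5
        type replicated
        min_size 1
        max_size 10
        step take root-10k
        step chooseleaf firstn 1 type host    # primary copy on 10k drives
        step emit
        step take root-7200
        step chooseleaf firstn -1 type host   # remaining copies on 7.2k
        step emit
    }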

[ceph-users] Cluster reliability

2015-07-14 Thread Robert LeBlanc
summarize the findings and report back to the list. Thank you, - ---- Robert LeBlanc GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 -BEGIN PGP SIGNATURE- Version: Mailvelope v0.13.1 Comment: https://www.mailvelope.com wsFcBAEBCAAQBQJVpUQtCRDmVDuy+mK58QAA3

Re: [ceph-users] Unsetting osd_crush_chooseleaf_type = 0

2015-07-17 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Yes, you will need to change osd to host as you thought, so that copies will be separated between hosts. You will keep running into the problems you see until that is changed. It will cause data movement. - Robert LeBlanc PGP Fingerprint 79A2 9CA4
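
The usual edit cycle for that change (a sketch):

    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt
    # edit crushmap.txt: "step chooseleaf firstn 0 type osd"
    #                 -> "step chooseleaf firstn 0 type host"
    crushtool -c crushmap.txt -o crushmap.new
    ceph osd setcrushmap -i crushmap.new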

Re: [ceph-users] CephFS vs RBD

2015-07-22 Thread Robert LeBlanc
are designed to have multiple discrete machines access the file system at the same time. As long as you use a clustered file system on RBD, you will be OK. Now, if that performs better than CephFS, is a question you will have to answer through testing. - Robert LeBlanc PGP

Re: [ceph-users] Enclosure power failure pausing client IO till all connected hosts up

2015-07-24 Thread Robert LeBlanc
Sorry, autocorrect. Decompiled crush map. Robert LeBlanc Sent from a mobile device please excuse any typos. On Jul 24, 2015 9:44 AM, "Robert LeBlanc" wrote: > Please provide the recompiled crush map. > > Robert LeBlanc > > Sent from a mobile device please excuse any t

Re: [ceph-users] Algorithm for default pg_count calculation

2015-07-27 Thread Robert LeBlanc
-END PGP SIGNATURE- Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Mon, Jul 27, 2015 at 8:22 AM, Konstantin Danilov wrote: > Hi all, > > I

Re: [ceph-users] Trying to remove osd

2015-07-27 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Did you kill the OSD process? You are still showing 28 OSDs up. I'm not sure that should stop you from removing it, though. You can also try ceph osd crush rm osd.21 - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904

Re: [ceph-users] Recovery question

2015-07-29 Thread Robert LeBlanc
. Just make sure you have the correct journal in the same host with the matching OSD disk; udev should do the magic. The OSD logs are your friend if they don't start properly. - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Wed, Jul 29, 20

Re: [ceph-users] Recovery question

2015-07-29 Thread Robert LeBlanc
rejoined. - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Wed, Jul 29, 2015 at 2:46 PM, Peter Hinman wrote: > Hi Greg - > > So at the moment, I seem to be trying to resolve a permission error. > > === osd.3 === > Moun

Re: [ceph-users] Recovery question

2015-07-29 Thread Robert LeBlanc
start it up. I haven't actually done this, so there still may be some bumps. - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Wed, Jul 29, 2015 at 3:44 PM, Peter Hinman wrote: > Thanks Robert - > > Where would that monitor data

Re: [ceph-users] Recovery question

2015-07-29 Thread Robert LeBlanc
, we can deal with that later if needed. - ---- Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Wed, Jul 29, 2015 at 4:40 PM, Peter Hinman wrote: > Ok - that is encouraging. I've believe I've got data from a previous > monitor. I

Re: [ceph-users] questions on editing crushmap for ceph cache tier

2015-07-30 Thread Robert LeBlanc
-END PGP SIGNATURE- -------- Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Wed, Jul 29, 2015 at 9:21 PM, van wrote: > Hi, list, > > Ceph cache tier seems v

Re: [ceph-users] Recovery question

2015-07-30 Thread Robert LeBlanc
l, just in case. - -------- Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu, Jul 30, 2015 at 12:41 PM, Peter Hinman wrote: > For the record, I have been able to recover. Thank you very much for the > guidance. > > I hate searching the w

Re: [ceph-users] Elastic-sized RBD planned?

2015-07-30 Thread Robert LeBlanc
-END PGP SIGNATURE- -------- Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Wed, Jul

Re: [ceph-users] dropping old distros: el6, precise 12.04, debian wheezy?

2015-07-30 Thread Robert LeBlanc
-END PGP SIGNATURE- Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu, Jul 30, 2015 at 7:54 AM, Sage Weil wrote: > As time marches on

Re: [ceph-users] OSD removal is not cleaning entry from osd listing

2015-07-31 Thread Robert LeBlanc
ATURE- Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Fri, Jul 31, 2015 at 1:15 AM, Mallikarjun Biradar wrote: > Hi, > > I had 27 OSD's in my cluster. I removed two of the OSD from (osd.20) > host-3 & (osd.22) host-

Re: [ceph-users] Check networking first?

2015-07-31 Thread Robert LeBlanc
into Ceph. The monitor can correlate failures and help determine if the problem is related to one host from the CRUSH map. - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu, Jul 30, 2015 at 11:27 PM, Stijn De Weirdt wrote: > wouldn'

Re: [ceph-users] 160 Thousand ceph-client.admin.*.asok files: Weird problem, never seen before

2015-08-09 Thread Robert LeBlanc
I'm guessing this is on an OpenStack node? There is a fix for this and I think it will come out in the next release. For now we have had to disable the admin sockets. Robert LeBlanc Sent from a mobile device please excuse any typos. On Aug 5, 2015 5:07 AM, "Vickey Singh"

Re: [ceph-users] 160 Thousand ceph-client.admin.*.asok files: Weird problem, never seen before

2015-08-10 Thread Robert LeBlanc
ran out of file descriptors. Restarting the nova process would let it work until the file descriptors were exhausted again. Robert LeBlanc Sent from a mobile device please excuse any typos. On Aug 10, 2015 1:13 AM, "Vickey Singh" wrote: > Hi Thank you for your reply > > No its these n

Re: [ceph-users] ceph distributed osd

2015-08-19 Thread Robert LeBlanc
-END PGP SIGNATURE- Robert LeBlanc PGP

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-19 Thread Robert LeBlanc
seems like it. I did some computation of a large sampling of our servers and found that the average request size was ~12K/~18K (read/write) and ~30%/70% (it looks like I didn't save that spreadsheet to get exact numbers). So, any optimization in smaller I/O sizes would really benefit us - -

Re: [ceph-users] Latency impact on RBD performance

2015-08-19 Thread Robert LeBlanc
---END PGP SIGNATURE- -------- Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Wed, Aug 19, 2015 at 8:20 AM, Logan Barfield wrote: > Hi, > > We are currently using 2 OSD hosts with SSDs to provide RBD backed volumes > for KVM hyperviso

Re: [ceph-users] Object Storage and POSIX Mix

2015-08-21 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Shouldn't this already be possible with HTTP Range requests? I don't work with RGW or S3 so please ignore me if I'm talking crazy. - -------- Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1

[ceph-users] OSD GHz vs. Cores Question

2015-08-21 Thread Robert LeBlanc
means a lot to our performance. Our current processors are Intel(R) Xeon(R) CPU E5-2640 v2 @ 2.00GHz [1] http://www.spinics.net/lists/ceph-users/msg19305.html [2] http://article.gmane.org/gmane.comp.file-systems.ceph.user/22713 Thanks, - -------- Robert LeBlanc PGP Fingerprint 79A2 9CA4

[ceph-users] EXT4 for Production and Journal Question?

2015-08-24 Thread Robert LeBlanc
l? If an OSD crashed in the middle of the write with no EXT4 journal, the file system would be repaired and then Ceph would rewrite the last transaction that didn't complete? I'm sure I'm missing something here... Thanks, [1] http://www.spinics.net/lists/ceph-users/msg20839.html

Re: [ceph-users] OSD GHz vs. Cores Question

2015-08-24 Thread Robert LeBlanc
pulling servers to replace fixed disks, so we are looking at hot swap options. I'll try and do some testing in our lab, but I won't be able to get a very good spread of data due to clock and core limitations in the existing hardware. - -------- Robert LeBlanc PGP Fingerprint 79A2

Re: [ceph-users] shutdown primary monitor, status of the osds on it will not change, and command like 'rbd create xxx' would block

2015-08-27 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 The OSDs should be marked down within about 30 seconds. Can you provide additional information such as ceph version and the ceph.conf file. Thanks, - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62

Re: [ceph-users] Defective Gbic brings whole Cluster down

2015-08-27 Thread Robert LeBlanc
coming from in that regard, but having Ceph kick out a misbehaving node quickly is appealing as well (there would have to be a way to specify that only so many nodes could be kicked out). - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu, Aug

Re: [ceph-users] Disk/Pool Layout

2015-08-27 Thread Robert LeBlanc
l-SSD) >SCSI9 (0,5,0) (sdi) - 800.2 GB ATA INTEL SC3510 SSDSC2BB80 (Pool-SSD) >SCSI9 (0,6,0) (sdj) - 800.2 GB ATA INTEL SC3510 SSDSC2BB80 (Pool-SSD) > > > Too little endurance. > Same as above [1] http://www.sandisk.com/assets/docs/WP004_OverProvisioning_WhyHow_FINA

Re: [ceph-users] Defective Gbic brings whole Cluster down

2015-08-27 Thread Robert LeBlanc
ll be implemented although I do love this idea to help against tail latency. - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu, Aug 27, 2015 at 12:48 PM, Jan Schermer wrote: > Don't kick out the node, just deal with it gracefull

Re: [ceph-users] Disk/Pool Layout

2015-08-27 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 On Thu, Aug 27, 2015 at 1:13 PM, Jan Schermer wrote: > >> On 27 Aug 2015, at 20:57, Robert LeBlanc wrote: >> >> -BEGIN PGP SIGNED MESSAGE- >> Hash: SHA256 >> >> >> >> >> On Thu

Re: [ceph-users] Defective Gbic brings whole Cluster down

2015-08-27 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 +1 :) - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu, Aug 27, 2015 at 1:16 PM, Jan Schermer wrote: Well, there's no other way to get reliable performance and SLAs compar

Re: [ceph-users] Disk/Pool Layout

2015-08-27 Thread Robert LeBlanc
-END PGP SIGNATURE- Robert LeBlan

Re: [ceph-users] Disk/Pool Layout

2015-08-27 Thread Robert LeBlanc
n of echoing "temporary write through" to the scsi_disk/cache_type sysfs node, but that's not available on Ubuntu for example). I agree about the separate partition, maybe it was a problem with the SSD cache I don't remember the specifics. Your suggestion on disabling barriers pea

Re: [ceph-users] Disk/Pool Layout

2015-08-28 Thread Robert LeBlanc
them. Even with barriers enabled, some assumptions the filesystems make are incorrect; for example, writes are not guaranteed to be atomic and may be reordered. The only real solution is using a filesystem with checksums like ZFS and more replicas, or you will lose data to misbehaving caches or bit-

Re: [ceph-users] 答复: shutdown primary monitor, status of the osds on it will not change, and command like 'rbd create xxx' would block

2015-08-28 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 My next thing would be to turn up debugging on all the monitors and see if the survivors are having a problem forming a quorum. - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu, Aug 27

Re: [ceph-users] modifying a crush rule

2015-08-28 Thread Robert LeBlanc
do on the CRUSH map. - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Fri, Aug 28, 2015 at 3:02 AM, Loic Dachary wrote: Hi, Is there a way to modify a ruleset in crush map, without decompiling and recompiling it with crush tool? There are

Re: [ceph-users] How should I deal with placement group numbers when reducing number of OSDs

2015-09-01 Thread Robert LeBlanc
ta in that pool yet, that may not be feasible for you. - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Tue, Sep 1, 2015 at 6:19 AM, Jan Schermer wrote: Hi, we're in the process of changing 480G drives for 1200G drives, which should cut the

Re: [ceph-users] Ceph Performance Questions with rbd images access by qemu-kvm

2015-09-01 Thread Robert LeBlanc
SSDs with an 8 core Atom (Intel(R) Atom(TM) CPU C2750 @ 2.40GHz) and size=1, and fio with 8 jobs and QD=8 doing sync, direct 4K reads/writes produced 2,600 IOPS. Don't get me wrong, it will help, but don't expect spectacular results. - -------- Robert LeBlanc PGP Fingerprint 79A2 9CA4
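
The test described would look something like this fio invocation (a sketch of the stated parameters; the target file and runtime are illustrative):

    fio --name=synctest --filename=/mnt/test/fio.dat --size=4G \
        --rw=randrw --rwmixread=50 --bs=4k --numjobs=8 --iodepth=8 \
        --direct=1 --sync=1 --time_based --runtime=60 --group_reporting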

Re: [ceph-users] How should I deal with placement group numbers when reducing number of OSDs

2015-09-01 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 I'm not convinced that a backing pool can be removed from a caching tier. I just haven't been able to get around to trying it. - -------- Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Tue, S
