Re: [ceph-users] Accelio & Ceph

2015-09-01 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Accelio and Ceph are still in heavy development and not ready for production. - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Tue, Sep 1, 2015 at 10:31 AM, German Anders wrote: Hi cephers

Re: [ceph-users] ceph distributed osd

2015-09-01 Thread Robert LeBlanc
ts in a cluster, you are doing that for performance, in which case I would say it is not enough. - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Tue, Sep 1, 2015 at 5:19 AM, gjprabu wrote: Hi Robert, We are going to use ceph with

Re: [ceph-users] Ceph Performance Questions with rbd images access by qemu-kvm

2015-09-01 Thread Robert LeBlanc
for my comfort. If you want to go with large boxes, I would be sure to do a lot of research and ask people here on the list about what needs to be done to get optimum performance. - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Tue, Sep 1

Re: [ceph-users] Ceph SSD CPU Frequency Benchmarks

2015-09-01 Thread Robert LeBlanc
[PGP signature block; no message preview]

Re: [ceph-users] "Geom Error" on boot with rbd volume

2014-09-24 Thread Robert LeBlanc
Could this be a lock issue? From what I understand, librbd does not create an rbd device, it is all done in userspace. I would make sure that you have unmapped the image from all machines and try it again. I haven't done a lot with librbd myself, but my co-workers have it working just fine with KVM

Re: [ceph-users] ceph debian systemd

2014-09-26 Thread Robert LeBlanc
Systemd is supposed to still use the init.d scripts if they are present, however I've run into problems with it on my CentOS 7 boxes. The biggest issue is that systemd does not like having multiple arguments to the scripts. There is a systemd directory in the Master branch that does work, but you h

Re: [ceph-users] ceph debian systemd

2014-09-26 Thread Robert LeBlanc
On Fri, Sep 26, 2014 at 10:35 AM, Sage Weil wrote: > On Fri, 26 Sep 2014, Robert LeBlanc wrote: > > > A simpler workaround is to simply run > > ceph-disk activate-all > > from rc.local. > > sage > > Thanks, I'll look into that! _
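A minimal sketch of that workaround, assuming a stock /etc/rc.local that is executed at boot:

    # appended to /etc/rc.local (file must be executable)
    ceph-disk activate-all    # activate all prepared Ceph OSD partitions at boot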

Re: [ceph-users] ceph debian systemd

2014-09-26 Thread Robert LeBlanc
ri, Sep 26, 2014 at 10:47 AM, Robert LeBlanc wrote: > > > On Fri, Sep 26, 2014 at 10:35 AM, Sage Weil wrote: > >> On Fri, 26 Sep 2014, Robert LeBlanc wrote: >> >> >> A simpler workaround is to simply run >> >> ceph-disk activate-all >>

[ceph-users] PG stuck creating

2014-09-30 Thread Robert LeBlanc
On our dev cluster, I've got a PG that won't create. We had a host fail with 10 OSDs that needed to be rebuilt. A number of other OSDs were down for a few days (did I mention this was a dev cluster?). The other OSDs eventually came up once the OSD maps caught up on them. I rebuilt the OSDs on all t

Re: [ceph-users] PG stuck creating

2014-09-30 Thread Robert LeBlanc
I rebuilt the primary OSD (29) in the hopes it would unblock whatever it was, but no luck. I'll check the admin socket and see if there is anything I can find there. On Tue, Sep 30, 2014 at 10:36 AM, Gregory Farnum wrote: > On Tuesday, September 30, 2014, Robert LeBlanc > wrote:

Re: [ceph-users] ceph, ssds, hdds, journals and caching

2014-10-03 Thread Robert LeBlanc
We have also gotten unrecoverable XFS errors with bcache. Our experience is that SSD journals provide about the same performance benefit as bcache (sometimes better), and SSD journals are easier to set up. On Fri, Oct 3, 2014 at 5:04 AM, Vladislav Gorbunov wrote: > >Has anyone tried using bcache of

[ceph-users] Blueprints

2014-10-09 Thread Robert LeBlanc
I have a question regarding submitting blueprints. Should only people who intend to do the work of adding/changing features of Ceph submit blueprints? I'm not primarily a programmer (but can do programming if needed), but have a feature request for Ceph. Thanks, Robert Le

Re: [ceph-users] Few questions.

2014-10-21 Thread Robert LeBlanc
I'm still pretty new at Ceph so take this with a grain of salt. 1. In our experience, we have tried SSD journals and bcache, we have had more stability and performance by just using SSD journals. We have created an SSD pool with the rest of the space and it did not perform much better

Re: [ceph-users] recovery process stops

2014-10-21 Thread Robert LeBlanc
I've had issues magically fix themselves overnight after waiting/trying things for hours. On Tue, Oct 21, 2014 at 1:02 PM, Harald Rößler wrote: > After more than 10 hours it is the same situation; I don’t think it will fix > itself over time. How can I find out what the problem is? > > > Am 21.10.2014

[ceph-users] Question about logging

2014-10-30 Thread Robert LeBlanc
ideally, we would like to have each of the daemons broken out into separate files on the syslog host, including a separate audit log (so it can't be tampered with). Anyone already doing this with rsyslog and would be willing to share their rsyslog conf? Thanks, Robert LeBlanc [1] ht

[ceph-users] Red Hat/CentOS kernel-ml to get RBD module

2014-11-06 Thread Robert LeBlanc
The maintainers of the kernel-ml[1] package have graciously accepted the request to include the RBD module in the mainline kernel build[2]. This should make it easier for people to test new kernels with RBD if they have better things to do than build their own kernels. Thanks, kernel-ml maintainers! Robert

Re: [ceph-users] Typical 10GbE latency

2014-11-06 Thread Robert LeBlanc
rtt min/avg/max/mdev = 0.130/0.157/0.190/0.016 ms IPoIB Mellanox ConnectX-3 MT27500 FDR adapter and Mellanox IS5022 QDR switch MTU set to 65520. CentOS 7.0.1406 running 3.17.2-1.el7.elrepo.x86_64 on Intel(R) Atom(TM) CPU C2750 with 32 GB of RAM. On Thu, Nov 6, 2014 at 9:46 AM, Udo Lembke wrote:

Re: [ceph-users] Typical 10GbE latency

2014-11-07 Thread Robert LeBlanc
. Robert LeBlanc Sent from a mobile device please excuse any typos. On Nov 7, 2014 4:25 AM, "Stefan Priebe - Profihost AG" < s.pri...@profihost.ag> wrote: > Hi, > > this is with intel 10GBE bondet (2x10Gbit/s) network. > rtt min/avg/max/mdev = 0.053/0.107/0.184/0.034 ms >

Re: [ceph-users] RBD kernel module for CentOS?

2014-11-07 Thread Robert LeBlanc
I believe that the kernel-ml and kernel-lt packages from ELrepo have the RBD module already built (except for CentOS7 which will get it on the next kernel release). If you want to stay with the stock kernel, I don't have a good answer. I've had to rebuild the kernel to get RBD. On Fri, Nov 7, 2014

[ceph-users] Not finding systemd files in Giant CentOS7 packages

2014-11-11 Thread Robert LeBlanc
oarch 1-0.el7 installed libcephfs1.x86_64 1:0.87-0.el7.centos @Ceph python-ceph.x86_64 1:0.87-0.el7.centos @Ceph Thanks, Robert Le

Re: [ceph-users] Typical 10GbE latency

2014-11-11 Thread Robert LeBlanc
Is this with an 8192-byte payload? The theoretical transfer time at 1 Gbps (you are only sending one packet, so LACP won't help) is 0.061 ms in one direction; double that and you are at 0.122 ms of bits in flight, then there is context switching, switch latency (store and forward assumed for 1 Gbps), etc., wh
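As a rough sanity check (ignoring headers, switch latency, and interrupt handling), the one-way serialization delay of an 8192-byte payload at 1 Gb/s comes out close to the figure above:

    # 8192 bytes * 8 bits/byte = 65,536 bits; at 10^9 bits/s that is ~0.065 ms one way, ~0.13 ms there and back
    echo "scale=3; 8192*8*1000/10^9" | bc    # prints .065 (milliseconds)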

Re: [ceph-users] v0.88 released

2014-11-14 Thread Robert LeBlanc
Will there be RPMs built for this release? Thanks, On Tue, Nov 11, 2014 at 5:24 PM, Sage Weil wrote: > This is the first development release after Giant. The two main > features merged this round are the new AsyncMessenger (an alternative > implementation of the network layer) from Haomai Wang

Re: [ceph-users] Anyone deploying Ceph on Docker?

2014-11-14 Thread Robert LeBlanc
Ceph in Docker is very intriguing to me, but I understood that there were still a number of stability and implementation issues. What is your experience? Please post a link to your blog when you are done, I'd be interested in reading it. On Fri, Nov 14, 2014 at 4:15 PM, Christopher Armstrong wrot

[ceph-users] Bug or by design?

2014-11-18 Thread Robert LeBlanc
I was going to submit this as a bug, but thought I would put it here for discussion first. I have a feeling that it could be behavior by design. ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578) I'm using a cache pool and was playing around with the size and min_size on the pool to see

Re: [ceph-users] Bug or by design?

2014-11-18 Thread Robert LeBlanc
On Nov 18, 2014 4:48 PM, "Gregory Farnum" wrote: > > On Tue, Nov 18, 2014 at 3:38 PM, Robert LeBlanc wrote: > > I was going to submit this as a bug, but thought I would put it here for > > discussion first. I have a feeling that it could be behavior by desig

Re: [ceph-users] Rebuild OSD's

2014-12-02 Thread Robert LeBlanc
tivity for a longer period of time impacting performance if you have not adjusted max_backfill and other related options. Robert LeBlanc On Sat, Nov 29, 2014 at 3:29 PM, Lindsay Mathieson < lindsay.mathie...@gmail.com> wrote: > I have 2 OSD's on two nodes top of zfs that I
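A hedged example of the kind of throttling being referred to, using Hammer-era option names and illustrative values:

    # reduce the impact of backfill/recovery on client I/O (can also be set in ceph.conf)
    ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1 --osd-recovery-op-priority 1'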

Re: [ceph-users] experimental features

2014-12-05 Thread Robert LeBlanc
I prefer the third option (enumeration). I don't see a point where we would enable experimental features on our production clusters, but it would be nice to have the same bits and procedures between our dev/beta and production clusters. On Fri, Dec 5, 2014 at 10:36 AM, Sage Weil wrote: > A while

Re: [ceph-users] normalizing radosgw

2014-12-10 Thread Robert LeBlanc
I'm a big fan of /etc/*.d/ configs. Basically, if the package-maintained /etc/ceph.conf includes all files in /etc/ceph.d/, then I can break up the files however I'd like (mon, osd, mds, client, one per daemon, etc.). Then when upgrading, I don't have to worry about the new packages trying to overwrit

Re: [ceph-users] normalizing radosgw

2014-12-10 Thread Robert LeBlanc
I guess you would have to specify the cluster name in /etc/ceph/ceph.conf? That would be my only concern. On Wed, Dec 10, 2014 at 1:28 PM, Sage Weil wrote: > On Wed, 10 Dec 2014, Robert LeBlanc wrote: > > I'm a big fan of /etc/*.d/ configs. Basically if the package maintained >

Re: [ceph-users] normalizing radosgw

2014-12-10 Thread Robert LeBlanc
If cluster is specified in /etc/default/ceph then I don't have any other reservations about your proposal. On Wed, Dec 10, 2014 at 1:39 PM, Sage Weil wrote: > On Wed, 10 Dec 2014, Robert LeBlanc wrote: > > I guess you would have to specify the cluster name in > /etc/ceph/ceph.con

Re: [ceph-users] normalizing radosgw

2014-12-10 Thread Robert LeBlanc
f. I thought there would be more input on this topic. I know that some people are vehemently opposed to *.d/, but I have really come to like it and cringe when something doesn't support it. On Wed, Dec 10, 2014 at 1:48 PM, Sage Weil wrote: > On Wed, 10 Dec 2014, Robert LeBlanc wrote: >

Re: [ceph-users] rbd snapshot slow restore

2014-12-16 Thread Robert LeBlanc
There are really only two ways to do snapshots that I know of and they have trade-offs: COW into the snapshot (like VMware, Ceph, etc): When a write is committed, the changes are committed to a diff file and the base file is left untouched. This only has a single write penalty, if you want to dis

Re: [ceph-users] rbd snapshot slow restore

2014-12-16 Thread Robert LeBlanc
On Tue, Dec 16, 2014 at 5:37 PM, Lindsay Mathieson < lindsay.mathie...@gmail.com> wrote: > > On 17 December 2014 at 04:50, Robert LeBlanc wrote: > > There are really only two ways to do snapshots that I know of and they > have > > trade-offs: > > > > COW in

Re: [ceph-users] Any tuning of LVM-Storage inside an VM related to ceph?

2014-12-18 Thread Robert LeBlanc
Udo, I was wondering yesterday if aligning the LVM VG to 4MB would provide any performance benefit. My hunch is that it would, much like aligning to erase blocks on SSDs (probably not so much now). I haven't had a chance to test it though. If you do, I'd like to know your results. On Thu, Dec 18, 2014 at 1

[ceph-users] What to do when a parent RBD clone becomes corrupted

2014-12-18 Thread Robert LeBlanc
snapshot. Am I going about this the wrong way? I can see having to restore a number of VM because of corrupted clone, but I'd hate to lose all the clones because of corruption in the snapshot. I would be happy if the restored snapshot would be flattened if it was a clone of another image previou
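For reference, a minimal sketch of the snapshot/clone workflow under discussion, with flatten used to detach a clone from its parent snapshot; the pool and image names are made up:

    rbd snap create rbd/base-image@gold        # snapshot the base image
    rbd snap protect rbd/base-image@gold       # protect the snapshot so it can be cloned
    rbd clone rbd/base-image@gold rbd/vm-0001  # create a copy-on-write clone for a VM
    rbd flatten rbd/vm-0001                    # copy all parent data into the clone, detaching it from the snapshot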

Re: [ceph-users] Need help from Ceph experts

2014-12-18 Thread Robert LeBlanc
I'm interested to know if there is a reference to this reference architecture. It would help alleviate some of the fears we have about scaling this thing to a massive scale (10,000s of OSDs). Thanks, Robert LeBlanc On Thu, Dec 18, 2014 at 3:43 PM, Craig Lewis wrote: > > > >

Re: [ceph-users] Need help from Ceph experts

2014-12-18 Thread Robert LeBlanc
.slideshare.net/Inktank_Ceph/scaling-ceph-at-cern > > > At large scale, the biggest problem will likely be network I/O on the > inter-switch links. > > > > On Thu, Dec 18, 2014 at 3:29 PM, Robert LeBlanc > wrote: >> >> I'm interested to know if there is a refe

Re: [ceph-users] High CPU/Delay when Removing Layered Child RBD Image

2014-12-19 Thread Robert LeBlanc
Do you know whether it uses 4MB or 4096 bytes if this value is not set? Thanks, Robert LeBlanc On Thu, Dec 18, 2014 at 6:51 PM, Tyler Wilson wrote: > Okay, this is rather unrelated to Ceph but I might as well mention how > this is fixed. When using the Juno-Release OpenStack

Re: [ceph-users] Recovering from PG in down+incomplete state

2014-12-19 Thread Robert LeBlanc
osd 7 back in, it would clear up. I'm just not seeing a secondary osd for that PG. Disclaimer: I could be totally wrong. Robert LeBlanc On Thu, Dec 18, 2014 at 11:41 PM, Mallikarjun Biradar < mallikarjuna.bira...@gmail.com> wrote: > > Hi all, > > I had 12 OSD's

Re: [ceph-users] Hanging VMs with Qemu + RBD

2014-12-19 Thread Robert LeBlanc
could get congestion especially on a 1 Gb network. Robert LeBlanc On Fri, Dec 19, 2014 at 9:33 AM, Nico Schottelius < nico-ceph-us...@schottelius.org> wrote: > > Hello, > > another issue we have experienced with qemu VMs > (qemu 2.0.0) with ceph-0.80 on Ubuntu 14.04 >

Re: [ceph-users] OSD & JOURNAL not associated - ceph-disk list ?

2014-12-22 Thread Robert LeBlanc
o the correct part-uuid partition. Ceph-disk list should map the journal and the data disks after that. Robert LeBlanc On Sun, Dec 21, 2014 at 7:11 AM, Florent MONTHEL wrote: > > Hi, > > I would like to separate OSD and journal on 2 différent disks so I have : > > 1 disk /dev/sde (1
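A sketch of what pointing an OSD journal at the correct by-partuuid partition looks like; the OSD id and UUID below are placeholders:

    ls -l /var/lib/ceph/osd/ceph-0/journal                          # should be a symlink to the journal partition
    ln -sf /dev/disk/by-partuuid/<journal-partuuid> /var/lib/ceph/osd/ceph-0/journal
    ceph-disk list                                                  # should now pair each data disk with its journal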

Re: [ceph-users] Need help from Ceph experts

2014-12-23 Thread Robert LeBlanc
ponents in the same OS because they can interfere with each other pretty badly. Putting them in VMs gets around some of the possible deadlocks, but then there is usually not enough disk IO. That is my $0.02. Robert LeBlanc Sent from a mobile device, please excuse any typos. On Dec 23, 2014 6:12 AM,

Re: [ceph-users] Crush Map and SSD Pools

2015-01-05 Thread Robert LeBlanc
It took me a while to figure out the callout script since it wasn't documented anywhere obvious. This is what I wrote down; it could be helpful to you or others: 1. Add the hook script to the ceph.conf file of each OSD: osd crush location hook = /path/to/script 2. Install the script a
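A minimal sketch of such a callout script, assuming Ceph invokes the hook with --cluster/--id/--type arguments and expects the OSD's CRUSH location printed as key=value pairs on stdout (the rack and root values are examples):

    #!/bin/bash
    # referenced from ceph.conf as: osd crush location hook = /path/to/script
    # Print this node's CRUSH location; the arguments passed by Ceph are ignored here.
    echo "host=$(hostname -s) rack=rack1 root=default"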

Re: [ceph-users] rbd snapshot slow restore

2015-01-05 Thread Robert LeBlanc
sequential access on your storage systems anyways. On Fri, Dec 26, 2014 at 6:33 AM, Lindsay Mathieson < lindsay.mathie...@gmail.com> wrote: > On Tue, 16 Dec 2014 11:50:37 AM Robert LeBlanc wrote: > > COW into the snapshot (like VMware, Ceph, etc): > > When a write is committed,

Re: [ceph-users] What to do when a parent RBD clone becomes corrupted

2015-01-06 Thread Robert LeBlanc
Now that the holidays are over, I'm going to bump this message to see if there are any good ideas on this. Thanks, Robert LeBlanc On Thu, Dec 18, 2014 at 2:21 PM, Robert LeBlanc wrote: > Before we base thousands of VM image clones off of one or more snapshots, > I want to test w

Re: [ceph-users] Marking a OSD a new in the OSDMap

2015-01-06 Thread Robert LeBlanc
I think it's because ceph-disk or ceph-deploy doesn't support --osd-uuid. On Wed, Dec 31, 2014 at 10:30 AM, Andrey Korolyov wrote: > On Wed, Dec 31, 2014 at 8:20 PM, Wido den Hollander wrote: > > On 12/31/2014 05:54 PM, Andrey Korolyov wrote: > >> On Wed, Dec 31, 2014 at 7:34 PM, Wido den Hollander

Re: [ceph-users] Different disk usage on different OSDs

2015-01-06 Thread Robert LeBlanc
Ceph currently isn't very smart about ordering the balancing operations; it can fill a disk before moving some things off of it, so if you are close to the toofull line it can push that OSD over. I think there is a blueprint to help with this being worked on for Hammer. You have a couple of options.
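Two of the usual options, sketched with illustrative numbers (the threshold and OSD id are examples):

    ceph osd reweight-by-utilization 110   # reweight OSDs that are over 110% of the average utilization
    ceph osd reweight 12 0.9               # or manually lower the weight of one nearly full OSD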

Re: [ceph-users] What to do when a parent RBD clone becomes corrupted

2015-01-06 Thread Robert LeBlanc
On Mon, Jan 5, 2015 at 6:01 PM, Gregory Farnum wrote: > On Thu, Dec 18, 2014 at 1:21 PM, Robert LeBlanc wrote: >> Before we base thousands of VM image clones off of one or more snapshots, I >> want to test what happens when the snapshot becomes corrupted. I don't >>

Re: [ceph-users] rbd resize (shrink) taking forever and a day

2015-01-06 Thread Robert LeBlanc
Can't this be done in parallel? If the OSD doesn't have an object, then it is a no-op and should be pretty quick. The number of outstanding operations could be limited to 100 or 1,000, which would provide a balance between speed and performance impact if there is data to be trimmed. I'm not a big fan o

Re: [ceph-users] rbd directory listing performance issues

2015-01-06 Thread Robert LeBlanc
options...it does not look like I've enabled > anything special in terms of mount options. > > Thanks, > > Shain > > > Shain Miley | Manager of Systems and Infrastructure, Digital Media | > smi...@npr.org | 202.513.3649 > > _

Re: [ceph-users] rbd directory listing performance issues

2015-01-06 Thread Robert LeBlanc
What fs are you running inside the RBD? On Tue, Jan 6, 2015 at 8:29 AM, Shain Miley wrote: > Hello, > > We currently have a 12 node (3 monitor+9 OSD) ceph cluster, made up of 107 x > 4TB drives formatted with xfs. The cluster is running ceph version 0.80.7: > > Cluster health: > cluster 504b5794-

Re: [ceph-users] rbd resize (shrink) taking forever and a day

2015-01-07 Thread Robert LeBlanc
Jan 6, 2015 at 4:19 PM, Josh Durgin wrote: > On 01/06/2015 10:24 AM, Robert LeBlanc wrote: >> >> Can't this be done in parallel? If the OSD doesn't have an object then >> it is a noop and should be pretty quick. The number of outstanding >> operations can be li

Re: [ceph-users] Monitors and read/write latency

2015-01-07 Thread Robert LeBlanc
Monitors are in charge of the CRUSH map. Whenever there is a change to the CRUSH map (an OSD goes down, a new OSD is added, PGs are increased, etc.), the monitors build a new CRUSH map and distribute it to all clients and OSDs. Once the client has the CRUSH map, it does not need to contact the m
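For anyone who wants to see exactly what the monitors are distributing, the current CRUSH map can be pulled and decompiled like this (the file names are arbitrary):

    ceph osd getcrushmap -o crushmap.bin       # fetch the compiled CRUSH map from the monitors
    crushtool -d crushmap.bin -o crushmap.txt  # decompile it into an editable text form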

Re: [ceph-users] rbd resize (shrink) taking forever and a day

2015-01-07 Thread Robert LeBlanc
Seems like a message bus would be nice. Each opener of an RBD could subscribe for messages on the bus for that RBD. Anytime the map is modified a message could be put on the bus to update the others. That opens up a whole other can of worms though. Robert LeBlanc Sent from a mobile device please

Re: [ceph-users] rbd directory listing performance issues

2015-01-07 Thread Robert LeBlanc
y heavy > directories at this point. > > Also...one thing I just noticed is that the 'ls |wc' returns right > away...even in cases when right after that I do an 'ls -l' and it takes a > while. > > Thanks, > > Shain > > Shain Miley | Manager

Re: [ceph-users] Ceph PG Incomplete = Cluster unusable

2015-01-08 Thread Robert LeBlanc
On Wed, Jan 7, 2015 at 10:55 PM, Christian Balzer wrote: > Which of course begs the question of why not having min_size at 1 > permanently, so that in the (hopefully rare) case of losing 2 OSDs at the > same time your cluster still keeps working (as it should with a size of 3). The idea is that
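For context, size and min_size are per-pool settings; a hedged example for a pool named rbd:

    ceph osd pool set rbd size 3       # keep three copies of every object
    ceph osd pool set rbd min_size 2   # refuse I/O when fewer than two copies are available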

[ceph-users] Ceph as backend for Swift

2015-01-08 Thread Robert LeBlanc
Anyone have a reference for documentation to get Ceph to be a backend for Swift? Thanks, Robert LeBlanc ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Ceph PG Incomplete = Cluster unusable

2015-01-09 Thread Robert LeBlanc
On Thu, Jan 8, 2015 at 8:31 PM, Christian Balzer wrote: > On Thu, 8 Jan 2015 11:41:37 -0700 Robert LeBlanc wrote: > Which of course currently means a strongly consistent lockup in these > scenarios. ^o^ That is one way of putting it > Slightly off-topic and snarky, that strong consi

Re: [ceph-users] Is ceph production ready? [was: Ceph PG Incomplete = Cluster unusable]

2015-01-09 Thread Robert LeBlanc
On Fri, Jan 9, 2015 at 3:00 AM, Nico Schottelius wrote: > Even though I do not like the fact that we lost a pg for > an unknown reason, I would prefer ceph to handle that case to recover to > the best possible situation. > > Namely I wonder if we can integrate a tool that shows > which (parts of)

Re: [ceph-users] Ceph SSD CPU Frequency Benchmarks

2015-09-02 Thread Robert LeBlanc
Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Wed, Sep 2, 2015 at 3:21 AM, Nick Fisk wrote: > I think this may b

Re: [ceph-users] Ceph SSD CPU Frequency Benchmarks

2015-09-02 Thread Robert LeBlanc
Still more work to do, but wanted to share my findings. - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Wed, Sep 2, 2015 at 9:50 AM, Robert LeBlanc wrote: - -BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Thanks for the responses. I forg

Re: [ceph-users] Ceph read / write : Terrible performance

2015-09-03 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Just about how funny "ceph problems" are fixed by changing network configurations. - -------- Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu, Sep 3, 2015 at 11:16 AM, Ian Colle wrote:

Re: [ceph-users] [Problem] I cannot start the OSD daemon

2015-09-08 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 I would check that the /var/lib/ceph/osd/ceph-0/ is mounted and has the file structure for Ceph. - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Mon, Sep 7, 2015 at 2:16 AM, Aaron wrote
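A quick way to check that, assuming osd.0 and the default data path:

    mount | grep /var/lib/ceph/osd/ceph-0   # the OSD data partition should be mounted here
    ls /var/lib/ceph/osd/ceph-0             # expect files such as fsid, whoami, keyring and the current/ directory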

[ceph-users] Straw2 kernel version?

2015-09-10 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Has straw2 landed in the kernel and if so which version? Thanks, - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1

Re: [ceph-users] Straw2 kernel version?

2015-09-10 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 My notes show that it should have landed in 4.1, but I also have written down that it wasn't merged yet. Just trying to get a confirmation on the version that it did land in. - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD

[ceph-users] Hammer reduce recovery impact

2015-09-10 Thread Robert LeBlanc
oes backfill/recovery bypass the journal? Thanks, - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1

Re: [ceph-users] Hammer reduce recovery impact

2015-09-10 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Do the recovery options kick in when there is only backfill going on? - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu, Sep 10, 2015 at 3:01 PM, Somnath Roy wrote: > Try all th

Re: [ceph-users] Hammer reduce recovery impact

2015-09-10 Thread Robert LeBlanc
urnal, but there would have to be some logic if the object was changed as it was being replicated. Maybe just a log in the journal that the objects are starting restore and have finished restore, so the journal flush knows if it needs to commit the write? - -------- Robert LeBlanc PGP Fingerprint

Re: [ceph-users] Hammer reduce recovery impact

2015-09-16 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 I was out of the office for a few days. We have some more hosts to add. I'll send some logs for examination. - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Fri, Sep 11, 2015 at 12:

Re: [ceph-users] question on reusing OSD

2015-09-16 Thread Robert LeBlanc
Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Wed, Sep 16, 2015 at 10:12 AM, John-Paul Robinson wrote: > Christian, > > Thanks for the feedback. > > I gu

Re: [ceph-users] cant get cluster to become healthy. "stale+undersized+degraded+peered"

2015-09-17 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 What are your iptables rules? - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu, Sep 17, 2015 at 1:01 AM, Stefan Eriksson wrote: > hi here is the info, I have added "ceph osd pool

[ceph-users] pgmap question

2015-09-17 Thread Robert LeBlanc
, it seems to not increment. What am I missing that causes the pgmap to change? Do these pgmap changes have to be computed by the monitors and distributed to the clients? Does the pgmap change constitute a CRUSH algorithm change? Thanks, - Robert LeBlanc PGP Fingerprint 79A2 9CA4

Re: [ceph-users] Software Raid 1 for system disks on storage nodes (not for OSD disks)

2015-09-18 Thread Robert LeBlanc
p the necessary ceph config items (ceph.conf and the OSD bootstrap keys). - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Fri, Sep 18, 2015 at 9:06 AM, Martin Palma wrote: > Hi, > > Is it a good idea to use a software raid for t
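Assuming default paths, the config items being referred to are roughly these two files:

    /etc/ceph/ceph.conf                       # cluster configuration
    /var/lib/ceph/bootstrap-osd/ceph.keyring  # OSD bootstrap key used when registering new OSDs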

[ceph-users] Delete pool with cache tier

2015-09-18 Thread Robert LeBlanc
es to test a high number of clones. Thanks, - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1

Re: [ceph-users] Delete pool with cache tier

2015-09-18 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Created request http://tracker.ceph.com/issues/13163 - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Fri, Sep 18, 2015 at 12:06 PM, John Spray wrote: > On Fri, Sep 18, 2015 at 7:04

Re: [ceph-users] Software Raid 1 for system disks on storage nodes (not for OSD disks)

2015-09-18 Thread Robert LeBlanc
case. My strongest recommendation is not to have swap if it is a pure OSD node. - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Fri, Sep 18, 2015 at 8:15 PM, 张冬卯 wrote: > yes, a raid1 system disk is necessary, from my perspective. > &

[ceph-users] Clarification of Cache settings

2015-09-18 Thread Robert LeBlanc
hod be available in Jewel? Is the current approach still being developed? Thanks, - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1

[ceph-users] Potential OSD deadlock?

2015-09-19 Thread Robert LeBlanc
l the blocked I/O and then it was fine after rejoining the cluster. Which logs, and at what level, would be most beneficial to increase for troubleshooting in this case? I hope this makes sense; it has been a long day. - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E

Re: [ceph-users] How to move OSD form 1TB disk to 2TB disk

2015-09-19 Thread Robert LeBlanc
. Then set the old disk to 'out'. This will keep the OSD participating in the backfills until it is empty. Once the backfill is done, stop the old OSD and remove it from the cluster. - -------- Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On S
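A sketch of that drain-and-remove sequence for an OSD numbered 12 (the id is a placeholder; let backfill finish before removing anything):

    ceph osd out 12               # stop mapping new data to the OSD; it keeps serving backfill until empty
    ceph -w                       # watch until backfill/recovery completes
    # stop the ceph-osd daemon for id 12 (init-system specific), then:
    ceph osd crush remove osd.12
    ceph auth del osd.12
    ceph osd rm 12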

Re: [ceph-users] multi-datacenter crush map

2015-09-19 Thread Robert LeBlanc
SIGNATURE- -------- Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Sat, Sep 19, 2015 at 12:54 PM, Wouter De Borger wrote: > Ok, so if I understand correctly, for replication level 3 or 4 I would have > to use the rule > > rule replicated_rules

Re: [ceph-users] Potential OSD deadlock?

2015-09-20 Thread Robert LeBlanc
[PGP signature block; no message preview]

Re: [ceph-users] Potential OSD deadlock?

2015-09-20 Thread Robert LeBlanc
og.xz . Thanks, - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Sun, Sep 20, 2015 at 9:03 AM, Robert LeBlanc wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA256 > > We had another incident of 100 long blocked I/O th

Re: [ceph-users] Delete pool with cache tier

2015-09-20 Thread Robert LeBlanc
ssd' is tier for rbd" - ---- Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Sun, Sep 20, 2015 at 9:36 PM, Wang, Zhiqiang wrote: > I remember previously I can delete the cache pool without flushing/evicting > all the objects first. The way

Re: [ceph-users] Delete pool with cache tier

2015-09-20 Thread Robert LeBlanc
[PGP signature block; no message preview]

Re: [ceph-users] Clarification of Cache settings

2015-09-20 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 On Sun, Sep 20, 2015 at 9:49 PM, Wang, Zhiqiang wrote: >> -Original Message- >> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of >> Robert LeBlanc >> Sent: Saturday, September 19, 2015 12

Re: [ceph-users] Potential OSD deadlock?

2015-09-20 Thread Robert LeBlanc
[stat,set-alloc-hint object_size 8388608 write_size 8388608,write 3235840~4096] 17.118f0c67 ack+ondisk+write+known_if_redirected e57590) currently waiting for rw locks ---- Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Sun, Sep 20, 2015 at 7

Re: [ceph-users] multi-datacenter crush map

2015-09-21 Thread Robert LeBlanc
p faster or CRUSH can't figure out how to distribute balanced data in an unbalanced way. You will probably want to look at primary affinity. This talks about SSD, but the same principle applies http://www.sebastien-han.fr/blog/2015/08/06/ceph-get-the-best-of-your-ssd-with-primary-affinity/
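Primary affinity is set per OSD; a hedged example that makes osd.3 half as likely to be chosen as primary (the monitors need mon osd allow primary affinity = true):

    ceph osd primary-affinity osd.3 0.5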

Re: [ceph-users] change ruleset with data

2015-09-21 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 I think you will be OK, but you should double check on a test cluster. You should be able to revert the rulesets if the data isn't found. - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 O

Re: [ceph-users] snapshot failed after enable cache tier

2015-09-21 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 If I recall, there were a bug or two found with cache tiers and snapshots that have since been fixed. I hope the fixes are being backported to Hammer. I don't know if this exactly fixes your issue. - Robert LeBlanc PGP Fingerprint 79A2

Re: [ceph-users] Software Raid 1 for system disks on storage nodes (not for OSD disks)

2015-09-21 Thread Robert LeBlanc
well (and any other spindles that have journals on it), or flush the journal and create a new one on the destination host. If you are using dm-crypt, you also need to save the encryption key as it is not on the OSD FS for obvious reasons. - Robert LeBlanc PGP Fingerprint 79A2 9CA4
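The journal flush/recreate step mentioned here looks roughly like this for an OSD numbered 12 (stop the ceph-osd daemon first; the id and the new journal device are placeholders):

    ceph-osd -i 12 --flush-journal   # write out anything still sitting in the old journal
    # move the data disk to the destination host and point the journal symlink at the new partition, then:
    ceph-osd -i 12 --mkjournal       # initialize the new journal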

Re: [ceph-users] Potential OSD deadlock?

2015-09-21 Thread Robert LeBlanc
[PGP signature block; no message preview]

Re: [ceph-users] Potential OSD deadlock?

2015-09-21 Thread Robert LeBlanc
Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Mon, Sep 21, 2015 at 4:33 PM, Gregory Farnum wrote: > On Mon, Sep 21, 2015 at 3:14 PM, Robert LeBlanc wrote: >> -BEGIN PGP SIGNED MESSAGE- >&g

Re: [ceph-users] Potential OSD deadlock?

2015-09-22 Thread Robert LeBlanc
Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Tue, Sep 22, 2015 at 8:09 AM, Gregory Farnum

Re: [ceph-users] ceph-disk prepare error

2015-09-22 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 What does `ceph-disk list` report for the partition? You may need to run `partprobe /dev/sdb`. If ceph-disk list shows prepared, then just run `ceph-disk activate /dev/sdb1`. - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904

Re: [ceph-users] Potential OSD deadlock?

2015-09-22 Thread Robert LeBlanc
[PGP signature block; no message preview]

Re: [ceph-users] Potential OSD deadlock?

2015-09-22 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 4.2.0-1.el7.elrepo.x86_64 - - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Tue, Sep 22, 2015 at 3:41 PM, Samuel Just wrote: > I looked

Re: [ceph-users] Potential OSD deadlock?

2015-09-22 Thread Robert LeBlanc
liable for ping, but still had the blocked I/O. I reduced the MTU to 1500 and checked pings (OK), but I'm still seeing the blocked I/O. - -------- Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Tue, Sep 22, 2015 at 3:52 PM, Sage Weil wrote: > O

Re: [ceph-users] Potential OSD deadlock?

2015-09-23 Thread Robert LeBlanc
Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Tue, Sep 22, 2015 at 4:15 PM, Robert LeBl

Re: [ceph-users] Potential OSD deadlock?

2015-09-23 Thread Robert LeBlanc
n't trying to congest things. We probably already saw this issue, just didn't know it. - -------- Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Wed, Sep 23, 2015 at 1:10 PM, Mark Nelson wrote: > FWIW, we've got some 40GbE

Re: [ceph-users] Basic object storage question

2015-09-23 Thread Robert LeBlanc
that. There is also a limit to the size of an object that can be stored. I think I've seen the number of 100GB thrown around. - -------- Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Wed, Sep 23, 2015 at 7:04 PM, Cory Hawkless wrote: >
