Re: [ceph-users] CRUSH rule for 3 replicas across 2 hosts

2015-04-20 Thread Dan van der Ster
I haven't tried, but wouldn't something like this work: step take default step chooseleaf firstn 2 type host step emit step take default step chooseleaf firstn -2 type osd step emit We use something like that for an asymmetric multi-room rule. Cheers, Dan On Apr 20, 2015 20:02, "Robert LeBlanc"
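Laid out in CRUSH map syntax, the rule sketched in that reply would look roughly like this (rule name and ruleset number are illustrative, and as Dan notes the rule is untested):

    rule replicated_3x_2hosts {
        ruleset 1
        type replicated
        min_size 3
        max_size 3
        step take default
        step chooseleaf firstn 2 type host
        step emit
        step take default
        step chooseleaf firstn -2 type osd
        step emit
    }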

Re: [ceph-users] CRUSH rule for 3 replicas across 2 hosts

2015-04-20 Thread Dan van der Ster
On Apr 20, 2015 20:22, "Gregory Farnum" wrote: > > On Mon, Apr 20, 2015 at 11:17 AM, Dan van der Ster wrote: > > I haven't tried, but wouldn't something like this work: > > > > step take default > > step chooseleaf firstn 2 type host > > st

Re: [ceph-users] 100% IO Wait with CEPH RBD and RSYNC

2015-04-21 Thread Dan van der Ster
sting you the kernel messages during such incidents here: > http://pastebin.com/X5JRe1v3 > > I was never debugging the kernel client. Can you give me a short hint > how to increase the debug level and where the logs will be written to? > > Regards, > Christian > > Am

Re: [ceph-users] OSDs failing on upgrade from Giant to Hammer

2015-04-22 Thread Dan van der Ster
Hi, On Tue, Apr 21, 2015 at 6:05 PM, Scott Laird wrote: > > ceph-objectstore-tool --op remove --data-path /var/lib/ceph/osd/ceph-36/ > --journal-path /var/lib/ceph/osd/ceph-36/journal --pgid $id > Out of curiosity, what is the difference between above and just rm'ing the pg directory from /curre

Re: [ceph-users] strange benchmark problem : restarting osd daemon improve performance from 100k iops to 300k iops

2015-04-27 Thread Dan van der Ster
Hi Sage, Alexandre et al. Here's another data point... we noticed something similar awhile ago. After we restart our OSDs the "4kB object write latency" [1] temporarily drops from ~8-10ms down to around 3-4ms. Then slowly over time the latency increases back to 8-10ms. The time that the OSDs stay

Re: [ceph-users] Scrub Error / How does ceph pg repair work?

2015-05-12 Thread Dan van der Ster
On Tue, May 12, 2015 at 1:07 AM, Anthony D'Atri wrote: > > > Agree that 99+% of the inconsistent PG's I see correlate directly to disk > flern. > > Check /var/log/kern.log*, /var/log/messages*, etc. and I'll bet you find > errors correlating. > More to this... In the case that an inconsistent P

Re: [ceph-users] Ceph mon leader oneliner?

2015-05-13 Thread Dan van der Ster
Hi, Same use case... We do: /usr/bin/ceph --admin-daemon /var/run/ceph/ceph-mon.*.asok mon_status | /usr/bin/json_reformat | /bin/grep state | /bin/grep -q leader && ... Cheers, Dan On May 13, 2015 09:35, "Kai Storbeck" wrote: > Hello fellow Ceph admins, > > I have a need to run some periodic
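Unwrapped, the one-liner reads as follows (the trailing command is a placeholder for whatever should only run on the current leader; jq would work just as well as json_reformat):

    ceph --admin-daemon /var/run/ceph/ceph-mon.*.asok mon_status \
      | json_reformat | grep state | grep -q leader \
      && /path/to/leader-only-job.sh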

Re: [ceph-users] calculating maximum number of disk and node failure that can be handled by cluster with out data loss

2015-06-10 Thread Dan van der Ster
This is a CRUSH misconception. Triple drive failures only cause data loss when they share a PG (e.g. ceph pg dump .. those [x,y,z] triples of OSDs are the only ones that matter). If you have very few OSDs, then it's possibly true that any combination of disks would lead to failure. But as you increa
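A quick way to list the acting sets in question (a sketch: on hammer the acting set is the 5th column of the brief dump, so adjust the awk field if your release prints a different layout):

    ceph pg dump pgs_brief 2>/dev/null \
      | awk '$1 ~ /^[0-9]+\./ {print $5}' \
      | sort | uniq -c | sort -rn | head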

Re: [ceph-users] calculating maximum number of disk and node failure that can be handled by cluster with out data loss

2015-06-10 Thread Dan van der Ster
> multi-terabyte volume, and even if the probability of failure was 0.1%, > that’s several orders of magnitude too high for me to be comfortable. > > I’d like nothing more than for someone to tell me I’m wrong :-) > > Jan > >> On 10 Jun 2015, at 09:55, Dan van der Ster wrote:

Re: [ceph-users] osd_scrub_sleep, osd_scrub_chunk_{min,max}

2015-06-10 Thread Dan van der Ster
I don't know if/why they're not documented, but we use them plus the scrub stride and iopriority options too: [osd] osd scrub sleep = .1 osd disk thread ioprio class = idle osd disk thread ioprio priority = 0 osd scrub chunk max = 5 osd deep scrub stride = 1048576 Cheers, Dan On Wed, J
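For readability, that flattened preview corresponds to a ceph.conf section like:

    [osd]
    osd scrub sleep = .1
    osd disk thread ioprio class = idle
    osd disk thread ioprio priority = 0
    osd scrub chunk max = 5
    osd deep scrub stride = 1048576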

Re: [ceph-users] calculating maximum number of disk and node failure that can be handled by cluster with out data loss

2015-06-10 Thread Dan van der Ster
> volumes are unavailable when I lose 3 OSDs - and I don’t have that many > volumes... > > Jan > >> On 10 Jun 2015, at 10:40, Dan van der Ster wrote: >> >> I'm not a mathematician, but I'm pretty sure there are 200 choose 3 = >> 1.3 million ways you

[ceph-users] kernel: libceph socket closed (con state OPEN)

2015-06-10 Thread Daniel van Ham Colchete
Hello everyone! I have been doing some log analysis on my systems here, trying to detect problems before they affect my users. One thing I have found is that I have been seeing a lot of those logs here: Jun 10 06:47:09 10.3.1.1 kernel: [2960203.682638] libceph: osd2 10.3.1.2:6800 socket closed (c

Re: [ceph-users] krbd splitting large IO's into smaller IO's

2015-06-10 Thread Dan van der Ster
Hi, I found something similar awhile ago within a VM. http://lists.opennebula.org/pipermail/ceph-users-ceph.com/2014-November/045034.html I don't know if the change suggested by Ilya ever got applied. Cheers, Dan On Wed, Jun 10, 2015 at 1:47 PM, Nick Fisk wrote: > Hi, > > Using Kernel RBD clien

Re: [ceph-users] Restarting OSD leads to lower CPU usage

2015-06-11 Thread Dan van der Ster
Hi Jan, Can you get perf top running? It should show you where the OSDs are spinning... Cheers, Dan On Thu, Jun 11, 2015 at 11:21 AM, Jan Schermer wrote: > Hi, > hoping someone can point me in the right direction. > > Some of my OSDs have a larger CPU usage (and ops latencies) than others. If I

[ceph-users] 10d

2015-06-17 Thread Dan van der Ster
Hi, After upgrading to 0.94.2 yesterday on our test cluster, we've had 3 PGs go inconsistent. First, immediately after we updated the OSDs PG 34.10d went inconsistent: 2015-06-16 13:42:19.086170 osd.52 137.138.39.211:6806/926964 2 : cluster [ERR] 34.10d scrub stat mismatch, got 4/5 objects, 0/0

Re: [ceph-users] v0.94.2 Hammer released

2015-06-17 Thread Dan van der Ster
On Thu, Jun 11, 2015 at 7:34 PM, Sage Weil wrote: > * ceph-objectstore-tool should be in the ceph server package (#11376, Ken > Dreyer) We had a little trouble yum updating from 0.94.1 to 0.94.2: file /usr/bin/ceph-objectstore-tool from install of ceph-1:0.94.2-0.el6.x86_64 conflicts with file

Re: [ceph-users] 10d

2015-06-17 Thread Dan van der Ster
On Wed, Jun 17, 2015 at 10:52 AM, Gregory Farnum wrote: > On Wed, Jun 17, 2015 at 8:56 AM, Dan van der Ster wrote: >> Hi, >> >> After upgrading to 0.94.2 yesterday on our test cluster, we've had 3 >> PGs go inconsistent. >> >> First, immediat

Re: [ceph-users] osd_scrub_chunk_min/max scrub_sleep?

2015-06-18 Thread Dan van der Ster
Hi, On Thu, Jun 18, 2015 at 3:09 AM, Tu Holmes wrote: > Hey gang, > > Some options are just not documented well… > > What’s up with: > osd_scrub_chunk_min > osd_scrub_chunk_max Those chunk sizes set how many objects to scrub per execution of the scrubber. > osd_scrub_sleep This sets how lon
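These can also be changed on a running cluster without restarting the OSDs, e.g. (the values are illustrative, not a recommendation from the thread):

    ceph tell osd.* injectargs '--osd_scrub_chunk_min 5 --osd_scrub_chunk_max 25 --osd_scrub_sleep 0.1'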

Re: [ceph-users] Interesting postmortem on SSDs from Algolia

2015-06-18 Thread Dan van der Ster
Thanks, that's a nice article. We're pretty happy with the SSDs he lists as "Good", but note that they're not totally immune to these type of issues -- indeed we've found that bcache can crash a DC S3700, and Intel confirmed it was a firmware bug. Cheers, Dan On Wed, Jun 17, 2015 at 8:36 PM, St

Re: [ceph-users] How does CephFS export storage?

2015-06-22 Thread Dan van der Ster
Hah! Nice way to invoke Cunningham's Law ;) > how does cephfs export storage to client? Ceph exports storage via its own protocol... the RADOS protocol for object IO and a sort of "CephFS" protocol to overlay filesystem semantics on top of RADOS. Ceph doesn't use NFS or iSCSI itself -- clients "

Re: [ceph-users] IO scheduler & osd_disk_thread_ioprio_class

2015-06-23 Thread Dan van der Ster
Hi Jan, I guess you have the OSD journal on the same spinning disk as the FileStore? Otherwise the synchronous writes go to the separate journal device so the fio test is less relevant. We've looked a lot at IO schedulers and concluded that ionice'ing the disk thread to idle is the best we can do. N
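Note the idle ioprio class only takes effect when the data disk is on the CFQ scheduler, so the setup being described is roughly (device name is illustrative):

    cat /sys/block/sdb/queue/scheduler          # should show [cfq]
    echo cfq > /sys/block/sdb/queue/scheduler   # switch it if not

    # and in ceph.conf:
    [osd]
    osd disk thread ioprio class = idle
    osd disk thread ioprio priority = 0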

Re: [ceph-users] IO scheduler & osd_disk_thread_ioprio_class

2015-06-23 Thread Dan van der Ster
On Tue, Jun 23, 2015 at 1:37 PM, Jan Schermer wrote: > Yes, I use the same drive > > one partition for journal > other for xfs with filestore > > I am seeing slow requests when backfills are occurring - backfills hit the > filestore but slow requests are (most probably) writes going to the journal

Re: [ceph-users] IO scheduler & osd_disk_thread_ioprio_class

2015-06-23 Thread Dan van der Ster
prised at the CFQ behaviour - the > drive can sustain tens of thousands of reads per second, thousands of writes - > yet saturating it with reads drops the writes to 10 IOPS - that’s mind > boggling to me. > > Jan > >> On 23 Jun 2015, at 13:43, Dan van der Ster wrote:

Re: [ceph-users] IO scheduler & osd_disk_thread_ioprio_class

2015-06-23 Thread Dan van der Ster
r synchronous writes with no other >>> load. >>> The drives I’m testing have ~8K IOPS when not under load - having them drop >>> to 10 IOPS is a huge problem. If it’s indeed a CFQ problem (as I suspect) >>> then no matter what drive you have you will have problems

[ceph-users] CephFS posix test performance

2015-06-26 Thread Dan van der Ster
Hi all, Today we are running some tests with this POSIX compatibility test [1] and noticed a possible performance regression in the latest kernel. The cluster we are testing is 0.94.2. 0.94.2 ceph-fuse client: All tests successful. Files=184, Tests=1957, 83 wallclock secs ( 0.72 usr 0.16 sys +

Re: [ceph-users] CephFS posix test performance

2015-06-30 Thread Dan van der Ster
On Tue, Jun 30, 2015 at 11:37 AM, Yan, Zheng wrote: > >> On Jun 30, 2015, at 15:37, Ilya Dryomov wrote: >> >> On Tue, Jun 30, 2015 at 6:57 AM, Yan, Zheng wrote: >>> I tried 4.1 kernel and 0.94.2 ceph-fuse. their performance are about the >>> same. >>> >>> fuse: >>> Files=191, Tests=1964, 60 wal

Re: [ceph-users] Rados gateway / RBD access restrictions

2015-07-01 Thread Dan van der Ster
On Wed, Jul 1, 2015 at 3:10 PM, Jacek Jarosiewicz wrote: > ok, I think I found the answer to the second question: > > http://wiki.ceph.com/Planning/Blueprints/Giant/Add_QoS_capacity_to_librbd > > ..librbd doesn't support any QoS for now.. But libvirt/qemu can do QoS: see iotune in https://libvirt
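As a concrete example of the libvirt/qemu side, per-device limits can be applied with virsh (domain, device and numbers are made-up placeholders):

    virsh blkdeviotune myvm vdb --total-iops-sec 500 --total-bytes-sec 104857600 --live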

Re: [ceph-users] Ceph FS - MDS problem

2015-07-03 Thread Dan van der Ster
Hi, We're looking at similar issues here and I was composing a mail just as you sent this. I'm just a user -- hopefully a dev will correct me where I'm wrong. 1. A CephFS cap is a way to delegate permission for a client to do IO with a file knowing that other clients are not also accessing that f

Re: [ceph-users] Slow requests when deleting rbd snapshots

2015-07-04 Thread Dan van der Ster
Hi, You should upgrade to the latest firefly release. You're probably suffering from the known issue with snapshot trimming. Cheers, Dan On Jul 4, 2015 10:19, "Eino Tuominen" wrote: > > Hello, > > We are running 0.80.5 on our production cluster and we are seeing slow requests when deleting rbd sn
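If upgrading can't happen right away, a knob commonly used to soften snap trimming (assuming your release already has it; the value is illustrative) is:

    ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.1'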

Re: [ceph-users] Slow requests when deleting rbd snapshots

2015-07-04 Thread Dan van der Ster
http://ceph.com/releases/v0-80-8-firefly-released/ osd: fix snap trimming performance issues #9487 #9113 Cheers, Dan On Jul 4, 2015 1:37 PM, "Shinobu Kinjo" wrote: > Can you tell us when it was fixed so that we see this fix on github? > > Kinjo > > On Sat, Jul 4, 201

Re: [ceph-users] Ceph FS - MDS problem

2015-07-07 Thread Dan van der Ster
Hi Greg, On Tue, Jul 7, 2015 at 4:25 PM, Gregory Farnum wrote: >> 4. "mds cache size = 500" is going to use a lot of memory! We have >> an MDS with just 8GB of RAM and it goes OOM after delegating around 1 >> million caps. (this is with mds cache size = 10, btw) > > Hmm. We do have some

Re: [ceph-users] strange issues after upgrading to SL6.6 and latest kernel

2015-07-14 Thread Dan van der Ster
Hi, This reminds me of when a buggy leveldb package slipped into the ceph repos (http://tracker.ceph.com/issues/7792). Which version of leveldb do you have installed? Cheers, Dan On Tue, Jul 14, 2015 at 3:39 PM, Barry O'Rourke wrote: > Hi, > > I managed to destroy my development cluster yesterday
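On an RPM-based distro such as Scientific Linux that's simply:

    rpm -q leveldb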

Re: [ceph-users] 10d

2015-07-17 Thread Dan van der Ster
nsigned long, int, ThreadPool::TPHandle*)+0xc16) [0x975a06] 2: (FileStore::_do_transactions(std::list >&, unsigned long, ThreadPool::TPHandle*)+0x64) [0x97d794] 3: (FileStore::_do_op(FileStore::OpSequencer*, ThreadPool::TPHandle&)+0x2a0) [0x97da50] 4: (ThreadPool::worker(ThreadPool::WorkThre

Re: [ceph-users] 10d

2015-07-17 Thread Dan van der Ster
egory Farnum wrote: > I think you'll need to use the ceph-objectstore-tool to remove the > PG/data consistently, but I've not done this — David or Sam will need > to chime in. > -Greg > > On Fri, Jul 17, 2015 at 2:15 PM, Dan van der Ster wrote: >> Hi Greg + lis

Re: [ceph-users] 10d

2015-07-17 Thread Dan van der Ster
A bit of progress: rm'ing everything from inside current/36.10d_head/ actually let the OSD start and continue deleting other PGs. Cheers, Dan On Fri, Jul 17, 2015 at 3:26 PM, Dan van der Ster wrote: > Thanks for the quick reply. > > We /could/ just wipe these OSDs and start fro

Re: [ceph-users] 10d

2015-07-22 Thread Dan van der Ster
I just filed a ticket after trying ceph-objectstore-tool: http://tracker.ceph.com/issues/12428 On Fri, Jul 17, 2015 at 3:36 PM, Dan van der Ster wrote: > A bit of progress: rm'ing everything from inside current/36.10d_head/ > actually let the OSD start and continue deleting other PGs.

[ceph-users] PGs going inconsistent after stopping the primary

2015-07-22 Thread Dan van der Ster
Hi Ceph community, Env: hammer 0.94.2, Scientific Linux 6.6, kernel 2.6.32-431.5.1.el6.x86_64 We wanted to post here before the tracker to see if someone else has had this problem. We have a few PGs (different pools) which get marked inconsistent when we stop the primary OSD. The problem is stra

Re: [ceph-users] PGs going inconsistent after stopping the primary

2015-07-22 Thread Dan van der Ster
ror originally occurred to debug further. > -Sam > > - Original Message - > From: "Dan van der Ster" > To: ceph-users@lists.ceph.com > Sent: Wednesday, July 22, 2015 7:49:00 AM > Subject: [ceph-users] PGs going inconsistent after stopping the primary > > H

Re: [ceph-users] PGs going inconsistent after stopping the primary

2015-07-23 Thread Dan van der Ster
pools used > for? > -Sam > > - Original Message - > From: "Dan van der Ster" > To: "Samuel Just" > Cc: ceph-users@lists.ceph.com > Sent: Wednesday, July 22, 2015 12:36:53 PM > Subject: Re: [ceph-users] PGs going inconsistent after stopping t

Re: [ceph-users] How to use cgroup to bind ceph-osd to a specific cpu core?

2015-07-27 Thread Dan van der Ster
On Mon, Jul 27, 2015 at 2:51 PM, Wido den Hollander wrote: > I'm testing with it on 48-core, 256GB machines with 90 OSDs each. This > is a +/- 20PB Ceph cluster and I'm trying to see how much we would > benefit from it. Cool. How many OSDs total? Cheers, Dan _

Re: [ceph-users] OSD RAM usage values

2015-07-28 Thread Dan van der Ster
On Tue, Jul 28, 2015 at 12:07 PM, Gregory Farnum wrote: > On Tue, Jul 28, 2015 at 11:00 AM, Kenneth Waegeman > wrote: >> >> >> On 07/17/2015 02:50 PM, Gregory Farnum wrote: >>> >>> On Fri, Jul 17, 2015 at 1:13 PM, Kenneth Waegeman >>> wrote: Hi all, I've read in the documenta

Re: [ceph-users] Is it safe to increase pg number in a production environment

2015-08-11 Thread Dan van der Ster
On Tue, Aug 4, 2015 at 9:48 PM, Stefan Priebe wrote: > Hi, > > Am 04.08.2015 um 21:16 schrieb Ketor D: >> >> Hi Stefan, >>Could you describe more about the linger ops bug? >>I'm runing Firefly as you say still has this bug. > > > It will be fixed in next ff release. > > This on: >

[ceph-users] ceph monitoring with graphite

2015-08-26 Thread Dan van der Ster
Hi Wido, On Wed, Aug 26, 2015 at 10:36 AM, Wido den Hollander wrote: > I'm sending pool statistics to Graphite We're doing the same -- stripping invalid chars as needed -- and I would guess that lots of people have written similar json2graphite convertor scripts for Ceph monitoring in the recent
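A minimal sketch of such a convertor as a shell pipeline (assumes jq and a Graphite line-protocol listener on port 2003; the metric naming, character stripping and netcat invocation are placeholders to adapt):

    ts=$(date +%s)
    ceph df -f json \
      | jq -r --arg ts "$ts" '.pools[] | "ceph.pools.\(.name).bytes_used \(.stats.bytes_used) \($ts)"' \
      | sed 's/[^A-Za-z0-9._ -]/_/g' \
      | nc graphite.example.com 2003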

Re: [ceph-users] ceph monitoring with graphite

2015-08-27 Thread Dan van der Ster
On Wed, Aug 26, 2015 at 6:52 PM, John Spray wrote: > On Wed, Aug 26, 2015 at 3:33 PM, Dan van der Ster wrote: >> Hi Wido, >> >> On Wed, Aug 26, 2015 at 10:36 AM, Wido den Hollander wrote: >>> I'm sending pool statistics to Graphite >> >> We're

Re: [ceph-users] crushtool won't compile its own output

2013-07-08 Thread Dan Van Der Ster
t find a way to escape the space in crush.txt (tried \ and ' '). I gather that either crushtool needs a patch to support spaces or the ceph osd crush commands need to forbid them... Cheers, Dan Dan Van Der Ster wrote: Hi, We are just deploying a new cluster (0.61.4) and no

Re: [ceph-users] crushtool won't compile its own output

2013-07-08 Thread Dan van der Ster
(ceph.com/tracker) describing > > this issue? :) > > Already there, > > http://tracker.ceph.com/issues/4779 > Doh, need to look harder next time. Cheers, Dan > > -Greg > > Software Engineer #42 @ http://inktank.com | http://ceph.com > >

[ceph-users] mon down for 3 hours after clocksource glitch

2013-07-18 Thread Dan van der Ster
h was so sensitive to this glitch. It is good that mon.2 managed to recover eventually, but does anyone have an idea why it took 3 hours??!! thx, Dan -- Dan van der Ster CERN IT-DSS ___ ceph-users mailing list ceph-users@lists.ceph.com

Re: [ceph-users] mon down for 3 hours after clocksource glitch

2013-07-18 Thread Dan van der Ster
On Thu, Jul 18, 2013 at 6:27 PM, Sage Weil wrote: > Hi Dan, > > On Thu, 18 Jul 2013, Dan van der Ster wrote: >> Hi, >> Last night our cluster became unhealthy for 3 hours after one of the mons (a >> qemu-kvm VM) had this glitch: >> >> Jul 18 00:12:43 andy03

Re: [ceph-users] mon down for 3 hours after clocksource glitch

2013-07-18 Thread Dan van der Ster
On Thu, Jul 18, 2013 at 9:29 PM, Sage Weil wrote: > this sounds exactly like the problem we just fixed in v0.61.5. Glad to hear that. Thanks for the quick help :) dan ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.

[ceph-users] optimizing recovery throughput

2013-07-18 Thread Dan van der Ster
least with this current set of test objects we have? Or am I missing another option that should be tweaked to get more recovery throughput? Thanks in advance, Dan -- Dan van der Ster CERN IT-DSS ___ ceph-users mailing list ceph-users@lists.ceph.com http

Re: [ceph-users] v0.61.5 Cuttlefish update released

2013-07-19 Thread Dan van der Ster
Was that 0.61.4 -> 0.61.5? Our upgrade of all mons and osds on SL6.4 went without incident. -- dan -- Dan van der Ster CERN IT-DSS On Friday, July 19, 2013 at 9:00 AM, Stefan Priebe - Profihost AG wrote: > crash is this one: > > 2013-07-19 08:59:32.137646 7f484a872780 0

Re: [ceph-users] optimizing recovery throughput

2013-07-20 Thread Dan van der Ster
On Sat, Jul 20, 2013 at 7:28 AM, Mikaël Cluseau wrote: > HI, > > > On 07/19/13 07:16, Dan van der Ster wrote: >> >> and that gives me something like this: >> >> 2013-07-18 21:22:56.546094 mon.0 128.142.142.156:6789/0 27984 : [INF] >> pgmap v112308:

Re: [ceph-users] Monitor is unable to start after reboot: OSDMonitor::update_from_paxos(bool*) FAILED assert(latest_bl.length() != 0

2013-07-23 Thread Dan van der Ster
On Tuesday, July 23, 2013 at 4:46 PM, pe...@2force.nl wrote: > On 2013-07-22 18:20, Joao Eduardo Luis wrote: > > On 07/22/2013 04:59 PM, pe...@2force.nl wrote: > > > Hi Joao, > > > > > > I have sent you the link to the monitor files. I stopped one other > > > monitor to h

Re: [ceph-users] ceph monitors stuck in a loop after install with ceph-deploy

2013-07-23 Thread Dan van der Ster
On Wednesday, July 24, 2013 at 7:19 AM, Sage Weil wrote: > On Wed, 24 Jul 2013, Sébastien RICCIO wrote: > > > > Hi! While trying to install ceph using ceph-deploy the monitors nodes are > > stuck waiting on this process: > > /usr/bin/python /usr/sbin/ceph-create-keys -i a (or b or c) > > > > I tr

Re: [ceph-users] Defective ceph startup script

2013-07-31 Thread Dan van der Ster
Wild guess, but are you by chance using the ceph-run wrapper around the daemons (enabled with docrun or --restart in the init script, if memory serves)? I noticed similar strangeness (can't stop daemon, can't check status) using ceph-run on a RHEL6-like distro a few months back, with bobtail. -- da

Re: [ceph-users] v0.67 Dumpling released

2013-08-14 Thread Dan van der Ster
On Wed, Aug 14, 2013 at 11:35 AM, Markus Goldberg wrote: > is it ok to upgrade from 0.66 to 0.67 by just running 'apt-get upgrade' and > rebooting the nodes one by one ? Did you see http://ceph.com/docs/master/release-notes/#upgrading-from-v0-66 ?? ___

Re: [ceph-users] v0.67 Dumpling released

2013-08-14 Thread Dan van der Ster
http://ceph.com/rpm-dumpling/el6/x86_64/ -- Dan van der Ster CERN IT-DSS On Wednesday, August 14, 2013 at 4:17 PM, Kyle Hutson wrote: > Any suggestions for upgrading CentOS/RHEL? The yum repos don't appear to have > been updated yet. > > I thought maybe with the "im
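i.e. a repo file along these lines (a sketch; GPG key handling omitted):

    # /etc/yum.repos.d/ceph.repo
    [ceph]
    name=Ceph dumpling
    baseurl=http://ceph.com/rpm-dumpling/el6/x86_64/
    enabled=1
    gpgcheck=0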

[ceph-users] striped rbd volumes with Cinder

2013-08-15 Thread Dan Van Der Ster
Hi, Did anyone manage to use striped rbd volumes with OpenStack Cinder (Grizzly)? I noticed in the current OpenStack master code that there are options for striping the new _backup_ volumes, but there's still nothing to do with striping in the master Cinder rbd driver. Is there a way to set some

Re: [ceph-users] striped rbd volumes with Cinder

2013-08-16 Thread Dan Van Der Ster
"Always give 100%. Unless you're giving blood." > > On August 15, 2013 at 4:45:54 PM, Dan Van Der Ster > (daniel.vanders...@cern.ch) wrote: > >> Hi, >> Did anyone manage to use striped rbd volumes with OpenStack Cinder >> (Grizzly)? I noticed in th

[ceph-users] ulimit max user processes (-u) and non-root ceph clients

2013-09-18 Thread Dan Van Der Ster
Hi, We just finished debugging a problem with RBD-backed Glance image creation failures, and thought our workaround would be useful for others. Basically, we found that during an image upload, librbd on the glance api server was consuming many many processes, eventually hitting the 1024 nproc li
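One shape such a workaround can take is raising the per-user process/thread cap for the service user (on RHEL6 the 1024 default comes from /etc/security/limits.d/90-nproc.conf); the file name, user and value below are illustrative and the thread's exact fix may differ:

    # /etc/security/limits.d/91-glance-nproc.conf
    glance    soft    nproc    4096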

Re: [ceph-users] ulimit max user processes (-u) and non-root ceph clients

2013-09-18 Thread Dan Van Der Ster
On Sep 18, 2013, at 11:50 PM, Gregory Farnum wrote: > On Wed, Sep 18, 2013 at 6:33 AM, Dan Van Der Ster > wrote: >> Hi, >> We just finished debugging a problem with RBD-backed Glance image creation >> failures, and thought our workaround would be useful for others.

Re: [ceph-users] ulimit max user processes (-u) and non-root ceph clients

2013-09-20 Thread Dan Van Der Ster
On Sep 19, 2013, at 6:10 PM, Gregory Farnum wrote: > On Wed, Sep 18, 2013 at 11:43 PM, Dan Van Der Ster > wrote: >> >> On Sep 18, 2013, at 11:50 PM, Gregory Farnum >> wrote: >> >>> On Wed, Sep 18, 2013 at 6:33 AM, Dan Van Der Ster >>> w

Re: [ceph-users] PG distribution scattered

2013-09-20 Thread Dan Van Der Ster
On Sep 19, 2013, at 3:43 PM, Mark Nelson wrote: > If you set: > > osd pool default flag hashpspool = true > > Theoretically that will cause different pools to be distributed more randomly. The name seems to imply that it should be settable per pool. Is that possible now? If set globally, do
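For reference, on releases that allow it the flag can be toggled per pool (pool name is a placeholder; flipping it on an existing pool will move data around):

    ceph osd pool set mypool hashpspool true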

[ceph-users] bit correctness and checksumming

2013-10-16 Thread Dan Van Der Ster
onflicts.) 4. If the checksum is already stored per object in the OSD, is this retrievable by librados? We have some applications which also need to know the checksum of the data and this would be handy if it was already calculated by Ceph. Thanks in advance! Dan

Re: [ceph-users] bit correctness and checksumming

2013-10-16 Thread Dan van der Ster
t the moment to check for myself, and the answer is relevant to this discussion anyway). Cheers, Dan Sage Weil wrote: >On Wed, 16 Oct 2013, Dan Van Der Ster wrote: >> Hi all, >> There has been some confusion the past couple days at the CHEP >> conference during conversations

Re: [ceph-users] bit correctness and checksumming

2013-10-16 Thread Dan van der Ster
On Wed, Oct 16, 2013 at 6:12 PM, Sage Weil wrote: >> 3. During deep scrub of an object with 2 replicas, suppose the checksum is >> different for the two objects -- which object wins? (I.e. if you store the >> checksum locally, this is trivial since the consistency of objects can be >> evaluated

Re: [ceph-users] Help with CRUSH maps

2013-10-31 Thread Dan van der Ster
On Thu, Oct 31, 2013 at 2:29 PM, Alexis GÜNST HORN wrote: > step take example > step emit This is the problem, AFAICT. Just omit those two lines in both rules and it should work. Cheers, dan ___ ceph-users mailing list ceph-users@lists.

Re: [ceph-users] Help with CRUSH maps

2013-10-31 Thread Dan van der Ster
On Thu, Oct 31, 2013 at 2:29 PM, Alexis GÜNST HORN wrote: > -11 0 drive hdd > -21 0 datacenter hdd-dc1 > -1020 room hdd-dc1-A > -5030 host A-ceph-osd-2 > 20 0

Re: [ceph-users] Mapping rbd's on boot

2013-11-14 Thread Dan Van Der Ster
Hi, We’re trying the same, on SLC. We tried rbdmap but it seems to have some ubuntu-isms which cause errors. We also tried with rc.local, and you can map and mount easily, but at shutdown we’re seeing the still-mapped images blocking a machine from shutting down (libceph connection refused error

[ceph-users] radosgw-admin log show

2013-11-28 Thread Dan Van Der Ster
Dear users/experts, Does anyone know how to use radosgw-admin log show? It seems to not properly read the --bucket parameter. # radosgw-admin log show --bucket=asdf --date=2013-11-28-09 --bucket-id=default.7750582.1 error reading log 2013-11-28-09-default.7750582.1-: (2) No such file or directo

Re: [ceph-users] ZFS on Ceph (rbd-fuse)

2013-11-29 Thread Dan van der Ster
On Fri, Nov 29, 2013 at 12:13 PM, Charles 'Boyo wrote: > That's because qemu-kvm > in CentOS 6.4 doesn't support librbd. RedHat just added RBD support in qemu-kvm-rhev in RHEV 6.5. I don't know if that will trickle down to CentOS but you can probably recompile it yourself like we did. https://rh

Re: [ceph-users] qemu-kvm packages for centos

2013-12-02 Thread Dan Van Der Ster
Hi, See this one also: http://tracker.ceph.com/issues/6365 But I’m not sure the Inktank patched qemu-kvm is relevant any longer since RedHat just released qemu-kvm-rhev with RBD support. Cheers, Dan On 02 Dec 2013, at 15:36, Darren Birkett wrote: > Hi List, > > Any chance the following will b

Re: [ceph-users] qemu-kvm packages for centos

2013-12-02 Thread Dan Van Der Ster
rdb is actually enabled. when/if I figure it out I will post it to the list. On 12/02/2013 10:46 AM, Dan Van Der Ster wrote: > Hi, > See this one also: http://tracker.ceph.com/issues/6365 > But I’m not sure the Inktank patched qemu-kvm is relevant any longer since > RedHat just relea

Re: [ceph-users] v0.67.11 dumpling released

2014-09-25 Thread Dan Van Der Ster
Hi Mike, > On 25 Sep 2014, at 17:47, Mike Dawson wrote: > > On 9/25/2014 11:09 AM, Sage Weil wrote: >> v0.67.11 "Dumpling" >> === >> >> This stable update for Dumpling fixes several important bugs that affect a >> small set of users. >> >> We recommend that all Dumpling users u

[ceph-users] ceph osd replacement with shared journal device

2014-09-26 Thread Dan Van Der Ster
Hi, Apologies for this trivial question, but what is the correct procedure to replace a failed OSD that uses a shared journal device? Suppose you have 5 spinning disks (sde,sdf,sdg,sdh,sdi) and these each have a journal partition on sda (sda1-5). Now sde fails and is replaced with a new drive.
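The flow under discussion is roughly the following (OSD id is a placeholder, device names follow the example above; a sketch rather than the thread's final recommendation):

    # retire the failed OSD
    ceph osd out $ID
    ceph osd crush remove osd.$ID
    ceph auth del osd.$ID
    ceph osd rm $ID
    # recreate it on the replacement drive, reusing the old journal partition
    ceph-disk prepare /dev/sde /dev/sda1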

Re: [ceph-users] ceph osd replacement with shared journal device

2014-09-29 Thread Dan Van Der Ster
Hi Wido, > On 26 Sep 2014, at 23:14, Wido den Hollander wrote: > > On 26-09-14 17:16, Dan Van Der Ster wrote: >> Hi, >> Apologies for this trivial question, but what is the correct procedure to >> replace a failed OSD that uses a shared journal device? >> >

Re: [ceph-users] ceph osd replacement with shared journal device

2014-09-29 Thread Dan Van Der Ster
Hi, > On 29 Sep 2014, at 10:01, Daniel Swarbrick > wrote: > > On 26/09/14 17:16, Dan Van Der Ster wrote: >> Hi, >> Apologies for this trivial question, but what is the correct procedure to >> replace a failed OSD that uses a shared journal device? >> >

Re: [ceph-users] ceph osd replacement with shared journal device

2014-09-29 Thread Dan Van Der Ster
’m going to trace what is happening with ceph-disk prepare /dev/sde /dev/sda1 and try to coerce that to use the persistent name. Cheers, Dan > > Best of luck. > > Owen > > > > > > On 09/29/2014 10:24 AM, Dan Van Der Ster wrote: >> Hi, >> >

Re: [ceph-users] SSD MTBF

2014-09-29 Thread Dan Van Der Ster
Hi Emmanuel, This is interesting, because we’ve had sales guys telling us that those Samsung drives are definitely the best for a Ceph journal O_o ! The conventional wisdom has been to use the Intel DC S3700 because of its massive durability. Anyway, I’m curious what do the SMART counters say o

Re: [ceph-users] ceph osd replacement with shared journal device

2014-09-29 Thread Dan Van Der Ster
> On 29 Sep 2014, at 10:47, Dan Van Der Ster wrote: > > Hi Owen, > >> On 29 Sep 2014, at 10:33, Owen Synge wrote: >> >> Hi Dan, >> >> At least looking at upstream to get journals and partitions persistently >> working, this requires gpt part

Re: [ceph-users] SSD MTBF

2014-10-01 Thread Dan Van Der Ster
> On 30 Sep 2014, at 16:38, Mark Nelson wrote: > > On 09/29/2014 03:58 AM, Dan Van Der Ster wrote: >> Hi Emmanuel, >> This is interesting, because we’ve had sales guys telling us that those >> Samsung drives are definitely the best for a Ceph journal O_o ! > >

Re: [ceph-users] CRUSH depends on host + OSD?

2014-10-15 Thread Dan van der Ster
Hi Chad, That sounds bizarre to me, and I can't reproduce it. I added an osd (which was previously not in the crush map) to a fake host=test: ceph osd crush create-or-move osd.52 1.0 rack=RJ45 host=test that resulted in some data movement of course. Then I removed that osd from the crush map
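i.e. something along these lines (osd id and bucket names as in the message):

    ceph osd crush create-or-move osd.52 1.0 rack=RJ45 host=test   # add under a fake host
    ceph osd crush remove osd.52                                   # then take it back out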

Re: [ceph-users] CRUSH depends on host + OSD?

2014-10-15 Thread Dan van der Ster
Hi, October 15 2014 7:05 PM, "Chad Seys" wrote: > Hi Dan, > I'm using Emperor (0.72). Though I would think CRUSH maps have not changed > that much btw versions? I'm using dumpling, with the hashpspool flag enabled, which I believe could have been the only difference. >> That sounds bizarre to

[ceph-users] converting legacy puppet-ceph configured OSDs to look like ceph-deployed OSDs

2014-10-15 Thread Dan van der Ster
Hi Ceph users, (sorry for the novel, but perhaps this might be useful for someone) During our current project to upgrade our cluster from disks-only to SSD journals, we've found it useful to convert our legacy puppet-ceph deployed cluster (using something like the enovance module) to one that loo

Re: [ceph-users] Lost monitors in a multi mon cluster

2014-10-24 Thread Dan van der Ster
Hi, October 24 2014 5:28 PM, "HURTEVENT VINCENT" wrote: > Hello, > > I was running a multi mon (3) Ceph cluster and in a migration move, I > reinstall 2 of the 3 monitors > nodes without deleting them properly into the cluster. > > So, there is only one monitor left which is stuck in probing
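The usual shape of that recovery (a sketch based on the standard monmap surgery, not necessarily the exact advice given in the thread; mon ids are placeholders) is to edit the monmap on the surviving monitor:

    service ceph stop mon.a                      # stop the surviving mon
    ceph-mon -i a --extract-monmap /tmp/monmap   # dump its current monmap
    monmaptool /tmp/monmap --rm b --rm c         # drop the two reinstalled mons
    ceph-mon -i a --inject-monmap /tmp/monmap    # put the edited map back
    service ceph start mon.a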

Re: [ceph-users] What a maximum theoretical and practical capacity in ceph cluster?

2014-10-27 Thread Dan van der Ster
Hi, October 27 2014 5:07 PM, "Wido den Hollander" wrote: > On 10/27/2014 04:30 PM, Mike wrote: > >> Hello, >> My company is planning to build a big Ceph cluster for archiving and >> storing data. >> By requirements from customer - 70% of capacity is SATA, 30% SSD. >> First day data is storing i

Re: [ceph-users] What a maximum theoretical and practical capacity in ceph cluster?

2014-10-28 Thread Dan Van Der Ster
> On 28 Oct 2014, at 08:25, Robert van Leeuwen > wrote: > >> By now we decide use a SuperMicro's SKU with 72 bays for HDD = 22 SSD + >> 50 SATA drives. >> Our racks can hold 10 this servers and 50 this racks in ceph cluster = >> 36000 OSD's, >>

Re: [ceph-users] What a maximum theoretical and practical capacity in ceph cluster?

2014-10-28 Thread Dan Van Der Ster
> On 28 Oct 2014, at 09:30, Christian Balzer wrote: > > On Tue, 28 Oct 2014 07:46:30 +0000 Dan Van Der Ster wrote: > >> >>> On 28 Oct 2014, at 08:25, Robert van Leeuwen >>> wrote: >>> >>>> By now we decide use a SuperMicro's SKU

Re: [ceph-users] Scrub proces, IO performance

2014-10-28 Thread Dan Van Der Ster
Hi, You should try the new osd_disk_thread_ioprio_class / osd_disk_thread_ioprio_priority options. Cheers, dan On 28 Oct 2014, at 09:27, Mateusz Skała wrote: Hello, We are using Ceph as a storage backend for KVM, used for hosting MS Windows RDP, Linux for we
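These can usually be injected into running OSDs as well (note they only have an effect when the OSD data disks are on the CFQ scheduler):

    ceph tell osd.* injectargs '--osd_disk_thread_ioprio_class idle --osd_disk_thread_ioprio_priority 0'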

[ceph-users] RHEL6.6 upgrade (selinux-policy-targeted) triggers slow requests

2014-10-29 Thread Dan Van Der Ster
Hi RHEL/CentOS users, This is just a heads up that we observe slow requests during the RHEL6.6 upgrade. The upgrade includes selinux-policy-targeted, which runs this during the update: /sbin/restorecon -i -f - -R -p -e /sys -e /proc -e /dev -e /mnt -e /var/tmp -e /home -e /tmp -e /dev re

Re: [ceph-users] Delete pools with low priority?

2014-10-30 Thread Dan van der Ster
Hi Daniel, I can't remember if deleting a pool invokes the snap trimmer to do the actual work deleting objects. But if it does, then it is most definitely broken in everything except latest releases (actual dumpling doesn't have the fix yet in a release). Given a release with those fixes (see track

Re: [ceph-users] Delete pools with low priority?

2014-10-30 Thread Dan van der Ster
October 30 2014 11:32 AM, "Daniel Schneller" wrote: > On 2014-10-30 10:14:44 +, Dan van der Ster said: > >> Hi Daniel, >> I can't remember if deleting a pool invokes the snap trimmer to do the >> actual work deleting objects. But if it does,

Re: [ceph-users] rhel7 krbd backported module repo ?

2014-11-03 Thread Dan van der Ster
There's this one: http://gitbuilder.ceph.com/kmod-rpm-rhel7beta-x86_64-basic/ref/rhel7/x86_64/ But that hasn't been updated since July. Cheers, Dan On Mon Nov 03 2014 at 5:35:23 AM Alexandre DERUMIER wrote: > Hi, > > I would like to known if a repository is available for rhel7/centos7 with >

Re: [ceph-users] Typical 10GbE latency

2014-11-06 Thread Dan van der Ster
Between two hosts on an HP Procurve 6600, no jumbo frames: rtt min/avg/max/mdev = 0.096/0.128/0.151/0.019 ms Cheers, Dan On Thu Nov 06 2014 at 2:19:07 PM Wido den Hollander wrote: > Hello, > > While working at a customer I've ran into a 10GbE latency which seems > high to me. > > I have access

Re: [ceph-users] PG inconsistency

2014-11-06 Thread Dan van der Ster
Hi, I've only ever seen (1), EIO to read a file. In this case I've always just killed / formatted / replaced that OSD completely -- that moves the PG to a new master and the new replication "fixes" the inconsistency. This way, I've never had to pg repair. I don't know if this is a best or even good

Re: [ceph-users] PG inconsistency

2014-11-06 Thread Dan van der Ster
IIRC, the EIO we had also correlated with a SMART status that showed the disk was bad enough for a warranty replacement -- so yes, I replaced the disk in these cases. Cheers, Dan On Thu Nov 06 2014 at 2:44:08 PM GuangYang wrote: > Thanks Dan. By "killed/formatted/replaced the OSD", did you repl

Re: [ceph-users] Reusing old journal block device w/ data causes FAILED assert(0)

2014-11-13 Thread Dan van der Ster
Hi, Did you mkjournal the reused journal? ceph-osd -i $ID --mkjournal Cheers, Dan On Thu Nov 13 2014 at 2:34:51 PM Anthony Alba wrote: > When I create a new OSD with a block device as journal that has > existing data on it, ceph is causing FAILED assert. The block device > is a journal fr

Re: [ceph-users] Reusing old journal block device w/ data causes FAILED assert(0)

2014-11-13 Thread Dan van der Ster
Hi, On Thu Nov 13 2014 at 3:35:55 PM Anthony Alba wrote: > Ah no. > On 13 Nov 2014 21:49, "Dan van der Ster" > wrote: > >> Hi, >> Did you mkjournal the reused journal? >> >>ceph-osd -i $ID --mkjournal >> >> Cheers, Dan >>

[ceph-users] Client forward compatibility

2014-11-20 Thread Dan van der Ster
Hi all, What is compatibility/incompatibility of dumpling clients to talk to firefly and giant clusters? I know that tunables=firefly will prevent dumpling clients from talking to a firefly cluster, but how about the existence or not of erasure pools? Can a dumpling client talk to a Firefly/Giant e

Re: [ceph-users] Client forward compatibility

2014-11-25 Thread Dan Van Der Ster
Hi Greg, > On 24 Nov 2014, at 22:01, Gregory Farnum wrote: > > On Thu, Nov 20, 2014 at 9:08 AM, Dan van der Ster > wrote: >> Hi all, >> What is compatibility/incompatibility of dumpling clients to talk to firefly >> and giant clusters? > > We sadly don
