Hi Erwin,
Did you try restarting the primary osd for that pg (24)? Sometimes it
needs a little nudge that way.
Otherwise what does ceph pg dump say about that pg?
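Something like this, assuming a standard sysvinit setup (substitute the real
pgid/osd id):

  ceph pg <pgid> query    # 'up'/'acting' show the primary osd and why it's stuck
  sudo service ceph restart osd.<id>   # on the node holding that osd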
Cheers,
Martin
On Thu, Sep 4, 2014 at 9:00 AM, Erwin Lubbers
wrote:
> Hi,
>
> My cluster is giving one stuck pg which seems t
Hi Dan,
We took a different approach (and our cluster is tiny compared to many
others) - we have two pools; normal and ssd.
We use 14 disks in each osd-server; 8 platter and 4 ssd for ceph, and 2 ssd
for OS/journals. We partitioned the two OS ssd as raid1 using about half
the space for the OS and
On Thu, Sep 4, 2014 at 10:23 PM, Dan van der Ster wrote:
> Hi Martin,
>
> September 4 2014 10:07 PM, "Martin B Nielsen" wrote:
> > Hi Dan,
> >
> > We took a different approach (and our cluster is tiny compared to many
> others) - we have two pools;
Just echoing what Christian said.
Also, iirc the "currently waiting for subops on [" message could also mean a
problem on those osds, as it waits for acks from them (I might remember wrong).
If that is the case you might want to check in on osd 13 & 37 as well.
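If you want to peek at what they are waiting on, the admin socket works
(default socket paths assumed):

  ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok dump_ops_in_flight
  ceph --admin-daemon /var/run/ceph/ceph-osd.37.asok dump_ops_in_flight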
With the cluster load and size you should not hav
Hi,
Or did you mean some OSDs are near full while others are under-utilized?
On Sat, Sep 6, 2014 at 5:04 PM, Christian Balzer wrote:
>
> Hello,
>
> On Fri, 05 Sep 2014 15:31:01 -0700 JIten Shah wrote:
>
> > Hello Cephers,
> >
> > We created a ceph cluster with 100 OSD, 5 MON and 1 MSD and most o
Hi,
I don't recognize that picture; we've been running Samsung 840 Pro in
production for almost 2 years now - and have had 1 fail.
We run an 8-node mixed ssd/platter cluster with 4x Samsung 840 Pro (500GB) in
each, so that is 32x ssd.
They've each written ~25TB of data on average.
Using the dd you had ins
the osds with Samsung journal drive compared with the Intel drive on the
> same server. Something like 2-3ms for Intel vs 40-50ms for Samsungs.
>
> At some point we had enough with Samsungs and scrapped them.
>
> Andrei
>
Hi,
Inside your mounted osd there is a symlink - journal - pointing to a file
or disk/partition used with it.
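E.g. (default mount path assumed):

  ls -l /var/lib/ceph/osd/ceph-*/journal

will list where each journal points.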
Cheers,
Martin
On Thu, May 7, 2015 at 11:06 AM, Patrik Plank wrote:
> Hi,
>
>
> I can't remember on which drive I installed which OSD journal :-||
> Is there any command to show this?
>
Hi,
I'd just like to echo what Wolfgang said about ceph being a complex system.
I initially started out testing ceph with a setup much like yours, and while
it performed OK overall, it was not as good as sw raid on the same machine.
Also, as Mark said, you'll have at the very best half write speeds b
Hi Jeff,
I would be surprised as well - we initially tested on a 2-replica cluster
with 8 nodes having 12 osd each - and went to 3-replica as we re-built the
cluster.
The performance seems to be where I'd expect it (doing consistent writes in
a rbd VM @ ~400MB/sec on 10GbE which I'd expect is eit
Hi Shain,
Those R515 seem to mimic our servers (2U supermicro w. 12x 3.5" bays and 2x
2.5" in the rear for OS).
Since we need a mix of SSD & platter we have 8x 4TB drives and 4x 500GB SSD
+ 2x 250GB SSD for OS in each node (2x 8-port LSI 2308 in IT-mode)
We've partitioned 10GB from each 4x 500GB
Hi Scott,
Just some observations from here.
We run 8 nodes, 2U units with 12x OSD each (4x 500GB ssd, 8x 4TB platter)
attached to 2x LSI 2308 cards. Each node uses an intel E5-2620 with 32G mem.
Granted, we only have like 25 VM (some fairly io-hungry, both iops and
throughput-wise though) on tha
Hi,
Plus reads will still come from your non-SSD disks unless you're using
something like flashcache in front and as Greg said, having much more IOPS
available for your db often makes a difference (depending on load, usage
etc ofc).
We're using Samsung Pro 840 256GB pretty much like Martin descri
Probably common sense, but I was bitten by this once in a similar
situation..
If you run 3x replica and distribute them over 3x hosts (is that the default
now?), make sure the disks on the host with the failed disk have space
for it - the remaining two disks will have to hold the content of the
failed one.
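Quick example with made-up numbers: 3x 4TB disks per host at 70% full each
means ~2.8TB sits on the failing disk; split over the two survivors that is
+1.4TB each, taking them from 70% to ~105% - i.e. toofull territory.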
Hi,
I'd almost always go with more, less beefy nodes rather than fewer big ones.
You're much more vulnerable when one of the big ones dies, and with smaller
nodes replication will not impact your cluster as much.
I also find it easier to extend a cluster with smaller nodes. At least it
feels like you can grow at a smoother rate
Hi,
We settled on Samsung 840 Pro 240GB drives 1½ years ago and we've been happy
so far. We've over-provisioned them a lot (left 120GB unpartitioned).
We have 16x 240GB and 32x 500GB - we've lost 1x 500GB so far.
smartctl states something like
Wear = 092%, Hours = 12883, Datawritten = 15321.83 TB
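For reference, we pull those numbers with something like this (Samsung
attribute names assumed):

  smartctl -A /dev/sdX | egrep 'Wear_Leveling|Power_On_Hours|Total_LBAs_Written'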
A bit late getting back on this one.
On Wed, Oct 1, 2014 at 5:05 PM, Christian Balzer wrote:
> > smartctl states something like
> > Wear = 092%, Hours = 12883, Datawritten = 15321.83 TB avg on those. I
> > think that is ~30TB/day if I'm doing the calc right.
> >
> Something very much does not ad
Hi Luis,
I might remember wrong, but don't you need to actually create the osd
first? (ceph osd create)
Then you can assign it a position using the crush CLI commands.
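From memory it is roughly (weight/host are just examples):

  ceph osd create                        # prints the new osd id, e.g. 12
  ceph osd crush add osd.12 1.0 host=node3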
Like Jason said, can you send the ceph osd tree output?
Cheers,
Martin
On Mon, Jan 12, 2015 at 1:45 PM, Luis Periquito wrote:
>
Hi,
You didn't state which version of ceph or kvm/qemu you're using. I think it
wasn't until qemu 1.5.0 (1.4.2+?) that an async patch from Inktank was
accepted into mainline, which significantly helps in situations like this.
If you're not using that, on top of not limiting recovery threads, you'll prob.
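If it is the recovery part, something like this in ceph.conf tames it
(values are just a conservative starting point):

  [osd]
  osd max backfills = 1
  osd recovery max active = 1
  osd recovery op priority = 1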
Hi,
At least it used to be like that - I'm not sure if that has changed. I
believe this is also part of why it is advised to go with the same kind of hw
and setup if possible.
Since rbd images, at least, are spread as objects throughout the cluster,
you'll prob. have to wait for a slow disk when readin
Hi,
I would prob. start by figuring out exactly which pgs are stuck unclean.
You can do 'ceph pg dump | grep unclean' to get that info - then if your
theory holds you should be able to verify the disk(s) in question.
I can't see any _too_full flags, so I'm curious what the cause could be.
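There's also a shortcut:

  ceph pg dump_stuck unclean
  ceph pg <pgid> query      # for each pg it lists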
You can also a
Hi Pavel,
Will try and answer some of your questions:
My first question will be about monitor data directory. How much space I
> need to reserve for it? Can monitor-fs be corrupted if monitor goes out of
> storage space?
>
We have about 20GB partitions for monitors - they really don't use much
space.
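You can check actual usage with (default path assumed):

  du -sh /var/lib/ceph/mon/*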
Hi,
You can't form quorum with your monitors on cuttlefish if you're mixing <
0.61.5 with any 0.61.5+ ( https://ceph.com/docs/master/release-notes/ ) =>
section about 0.61.5.
I'd advise installing pre-0.61.5, forming quorum and then upgrading to 0.61.9
(if need be) - and then latest dumpling on top.
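Check what each node actually runs first, e.g.:

  ceph --version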
Hi,
I'd probably start by looking at your nodes and check if the SSDs are
saturated or if they have high write access times. If any of that is true,
does that account for all SSD or just some of them? Maybe some of the disks
needs a trim. Maybe test them individually directly on the cluster.
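Something like this per node should show it (sysstat package):

  iostat -x 2    # watch %util and await on the ssd devices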
If y
I'll try accessing the ticket Monday to get all
the details if it is still there.
Cheers,
Martin
>
> Looking forward to your reply, thank you.
>
> Cheers.
>
>
> On Fri, Mar 7, 2014 at 6:10 PM, Martin B Nielsen wrote:
>
>> Hi,
>>
>> I'd probably start b
Hi,
I can see ~17% hardware interrupts which I find a little high - can you
make sure all load is spread over all your cores (/proc/interrupts)?
What about disk util once you restart them? Are they all 100% utilized or
is it 'only' mostly cpu-bound?
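E.g.:

  cat /proc/interrupts    # see if the nic/hba irqs all land on cpu0
  mpstat -P ALL 2         # per-core load breakdown (sysstat)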
Also you're running a monitor on this node - h
Hi,
I experienced this from time to time with older releases of ceph, but
haven't stumbled upon it for some time.
Often I had to revert to the older state by using:
http://ceph.com/docs/master/rados/operations/add-or-rm-mons/#removing-monitors-from-an-unhealthy-cluster
and dump the monlist, find
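The gist of that page, from memory (ids/paths are examples):

  service ceph stop mon.a
  ceph-mon -i a --extract-monmap /tmp/monmap
  monmaptool --print /tmp/monmap
  monmaptool --rm b --rm c /tmp/monmap    # drop the broken ones
  ceph-mon -i a --inject-monmap /tmp/monmap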
Hi,
I can see you're running mon, mds and osd on the same server.
Also, from a quick glance you're using around 13GB resident memory.
If you only have 16GB in your system I'm guessing you'll be swapping about
now (or close). How much mem does the system hold?
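E.g.:

  free -m
  ps aux --sort=-rss | head    # top memory consumers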
Also, how busy are the disks? Or is
Hi,
We're running mysql in multi-master cluster (galera), mysql standalones,
postgresql, mssql and oracle db's on ceph RBD via QEMU/KVM. As someone else
pointed out it is usually faster with ceph, but sometimes you'll get some
odd slow reads.
Latency is our biggest enemy.
Oracle comes with an aw
First off, congrats to Inktank!
I'm sure that with Red Hat backing the project it will see even quicker
development.
My only worry is support for future non-RHEL platforms; like many others
we've built our ceph stack around ubuntu and I'm just hoping it won't
deteriorate into something like how it is
Hi,
I experienced exactly the same with 14.04 and the 0.79 release.
It was a fresh, clean install with the default crushmap and ceph-deploy
install as per the quick-start guide.
Oddly enough, changing the replica size (incl. min_size) from 3->2 (and 2->1)
and back again made it work.
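For reference, that was just (pool name is an example):

  ceph osd pool set rbd size 2
  ceph osd pool set rbd min_size 1
  ceph osd pool set rbd size 3
  ceph osd pool set rbd min_size 2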
I didn't have time to l
ERR] 1.73c missing primary copy of
> 9d7a673c/11b30 6c./head//1,
> unfound
>
>
> Summary: pg wont repair... what do u suggest
>
>
> Regards,
> Femi.
>
>
> On Fri, Feb 22, 2013 at 1:26 PM, Martin B Nielsen wr
Hi,
We did the opposite here, adding some SSDs in free slots after getting a
normal cluster running with SATA.
We just created a new pool for them and separated the two types. I
used this as a template:
http://ceph.com/docs/master/rados/operations/crush-map/?highlight=ssd#placing-different-pools-on
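The workflow boils down to (filenames/pool are examples):

  ceph osd getcrushmap -o crushmap.bin
  crushtool -d crushmap.bin -o crushmap.txt
  # edit crushmap.txt: add an ssd root + rule next to the platter one
  crushtool -c crushmap.txt -o crushmap.new
  ceph osd setcrushmap -i crushmap.new
  ceph osd pool set ssdpool crush_ruleset <rule-id>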
Hi Charles,
http://ceph.com/docs/master/rados/configuration/ceph-conf/#ceph-runtime-config
has a great example.
For all daemons of a type use * ( ceph osd tell \* injectargs
'--debug-osd 20 --debug-ms 1' )
More about loglevels here:
http://ceph.com/docs/master/rados/configuration/ceph-conf/#logs
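And for a single daemon, e.g.:

  ceph tell osd.12 injectargs '--debug-osd 20 --debug-ms 1'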
Hi Ashish,
Yep, that would be the correct way to do it.
If you already have a cluster running, a ceph -s will also show usage, ie
like:
>ceph -s
pgmap v1842777: 8064 pgs: 8064 active+clean; 1069 GB data, 2144 GB used,
7930 GB / 10074 GB avail; 3569B/s wr, 0op/s
This is a small test-cluster with
Hi Kakito,
You def. _want_ scrubbing to happen!
http://ceph.com/docs/master/rados/configuration/osd-config-ref/#scrubbing
If you feel it kills your system you can tweak some of the values, like the
following (example snippet after the list):
osd scrub load threshold
osd scrub max interval
osd deep scrub interval
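E.g. in ceph.conf (values purely illustrative, not recommendations):

  [osd]
  osd scrub load threshold = 0.5
  osd scrub max interval = 604800       # 1 week, in seconds
  osd deep scrub interval = 604800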
I have no experience in chan
Hi Bryan,
I asked the same question a few months ago:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-February/000221.html
But basically, that is pretty bad; you'll be stuck on your own and
would need to get in contact with Inktank - they might be able to help
rebuild a monitor for you.
Hi,
We're using ceph 10.2.5 and cephfs.
We had a weird episode where one monitor host (mon0r0), which was also the
current active mds node, had some sort of meltdown.
The monitor called elections on and off over ~1 hour, sometimes with
5-10 min between them.
On every occasion the mds also went through replay, reconnect, rejoin => active
(