Re: [ceph-users] OSD's hang after network blip

2020-01-16 Thread Nick Fisk
On Thursday, January 16, 2020 09:15 GMT, Dan van der Ster wrote: > Hi Nick, > > We saw the exact same problem yesterday after a network outage -- a few of > our down OSDs were stuck down until we restarted their processes. > > -- Dan > > > On Wed, Jan 15, 2020

Re: [ceph-users] OSD's hang after network blip

2020-01-15 Thread Nick Fisk
On Wednesday, January 15, 2020 14:37 GMT, "Nick Fisk" wrote: > Hi All, > > Running 14.2.5, currently experiencing some network blips isolated to a > single rack which is under investigation. However, it appears following a > network blip, random OSD's in unaf

[ceph-users] OSD's hang after network blip

2020-01-15 Thread Nick Fisk
happening, but I will try and inject more verbose logging next time it occurs. Not sure if anybody has come across this before or any ideas? In the past, as long as OSD's have been running they have always re-joined following any network issues. Nick Sample from OSD and cluster logs below.

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-02-25 Thread Nick Fisk
> -Original Message- > From: Vitaliy Filippov > Sent: 23 February 2019 20:31 > To: n...@fisk.me.uk; Serkan Çoban > Cc: ceph-users > Subject: Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow > storage for db - why? >

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-02-25 Thread Nick Fisk
> -Original Message- > From: Konstantin Shalygin > Sent: 22 February 2019 14:23 > To: Nick Fisk > Cc: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow > storage for db - why? > > Bluestore/RocksDB

Re: [ceph-users] Bluestore HDD Cluster Advice

2019-02-22 Thread Nick Fisk
>Yes and no... bluestore seems to not work really optimal. For example, >it has no filestore-like journal waterlining and flushes the deferred >write queue just every 32 writes (deferred_batch_ops). And when it does >that it's basically waiting for the HDD to commit and slowing down all >further wr
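For anyone wanting to see how their own OSDs are configured here, a minimal sketch, assuming the option referred to is bluestore_deferred_batch_ops and that this runs on an OSD host with admin-socket access; the OSD id and the injected value are placeholders, not recommendations from this thread:

#!/usr/bin/env python3
"""Sketch: inspect (and optionally inject) bluestore_deferred_batch_ops."""
import json
import subprocess

OSD_ID = 0  # placeholder OSD id on this host

def get_option(osd_id: int, option: str) -> str:
    # Query the running OSD over its admin socket.
    out = subprocess.check_output(
        ["ceph", "daemon", f"osd.{osd_id}", "config", "get", option])
    return json.loads(out)[option]

current = get_option(OSD_ID, "bluestore_deferred_batch_ops")
print(f"osd.{OSD_ID} bluestore_deferred_batch_ops = {current}")

# Runtime-only change (not persisted across restarts); uncomment to apply:
# subprocess.check_call(["ceph", "tell", "osd.*", "injectargs",
#                        "--bluestore_deferred_batch_ops=64"])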

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-02-22 Thread Nick Fisk
is also pointless as only ~30GB will be used. I'm currently running 30GB partitions on my cluster with a mix of 6,8,10TB disks. The 10TB's are about 75% full and use around 14GB, this is on mainly 3x Replica RBD(4MB objects) Nick
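As a rough worked example of the overhead described above, using only the figures quoted in the post (back-of-envelope arithmetic, not a sizing rule):

# Back-of-envelope DB sizing from the figures quoted above: a 10TB OSD at
# ~75% full uses ~14GB of RocksDB space (3x replica RBD, 4MB objects).
data_stored_tb = 10 * 0.75        # ~7.5 TB of object data on the OSD
db_used_gb = 14.0                 # observed BlueFS DB usage

ratio = db_used_gb / (data_stored_tb * 1024)   # DB per unit of data, ~0.18%
print(f"DB overhead: {ratio:.4%}")

# Projected DB usage if the same OSD were 100% full:
print(f"At 100% full: {10 * 1024 * ratio:.1f} GB (against a 30 GB partition)")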

Re: [ceph-users] slow_used_bytes - SlowDB being used despite lots of space free in BlockDB on SSD?

2018-10-30 Thread Nick Fisk
> > >> On 10/18/2018 7:49 PM, Nick Fisk wrote: > > >>> Hi, > > >>> > > >>> Ceph Version = 12.2.8 > > >>> 8TB spinner with 20G SSD partition > > >>> > > >>> Perf dump shows the followin

Re: [ceph-users] slow_used_bytes - SlowDB being used despite lots of space free in BlockDB on SSD?

2018-10-20 Thread Nick Fisk
> >> On 10/18/2018 7:49 PM, Nick Fisk wrote: > >>> Hi, > >>> > >>> Ceph Version = 12.2.8 > >>> 8TB spinner with 20G SSD partition > >>> > >>> Perf dump shows the following: > >>> > >>> "b

Re: [ceph-users] slow_used_bytes - SlowDB being used despite lots of space free in BlockDB on SSD?

2018-10-19 Thread Nick Fisk
> -Original Message- > From: Nick Fisk [mailto:n...@fisk.me.uk] > Sent: 19 October 2018 08:15 > To: 'Igor Fedotov' ; ceph-users@lists.ceph.com > Subject: RE: [ceph-users] slow_used_bytes - SlowDB being used despite lots of > space free in BlockDB on SSD?

Re: [ceph-users] slow_used_bytes - SlowDB being used despite lots of space free in BlockDB on SSD?

2018-10-19 Thread Nick Fisk
> > On 10/18/2018 7:49 PM, Nick Fisk wrote: > > Hi, > > > > Ceph Version = 12.2.8 > > 8TB spinner with 20G SSD partition > > > > Perf dump shows the following: > > > > "bluefs": { > > "gift_bytes": 0,

[ceph-users] slow_used_bytes - SlowDB being used despite lots of space free in BlockDB on SSD?

2018-10-18 Thread Nick Fisk
D, yet 4.5GB of DB is stored on the spinning disk? Am I also understanding correctly that BlueFS has reserved 300G of space on the spinning disk? Found a previous bug tracker for something which looks exactly the same case, but should be fixed now: https://tracker.ceph.com/issues/22264 Thanks, Nick
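For anyone hitting the same question, a minimal sketch of the check this thread revolves around, reading the bluefs counters from a perf dump over the admin socket (the OSD id is a placeholder; the counter names are the ones quoted in the replies):

#!/usr/bin/env python3
"""Sketch: report BlueFS DB vs slow-device usage for one OSD."""
import json
import subprocess

OSD_ID = 0  # placeholder; run on the host that owns this OSD

dump = json.loads(subprocess.check_output(
    ["ceph", "daemon", f"osd.{OSD_ID}", "perf", "dump"]))
bluefs = dump["bluefs"]

gib = 1024 ** 3
for key in ("db_total_bytes", "db_used_bytes",
            "slow_total_bytes", "slow_used_bytes"):
    print(f"{key:18} {bluefs.get(key, 0) / gib:8.2f} GiB")

if bluefs.get("slow_used_bytes", 0):
    print("RocksDB has spilled onto the slow (spinning) device.")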

Re: [ceph-users] Bluestore DB size and onode count

2018-09-10 Thread Nick Fisk
2 PM, Igor Fedotov wrote: > > > Hi Nick. > > > > > > On 9/10/2018 1:30 PM, Nick Fisk wrote: > >> If anybody has 5 minutes could they just clarify a couple of things > >> for me > >> > >> 1. onode count, should this be equal to the number of

[ceph-users] Bluestore DB size and onode count

2018-09-10 Thread Nick Fisk
docs. I know that different workloads will have differing overheads and potentially smaller objects. But am I understanding these figures correctly as they seem dramatically lower? Regards, Nick

[ceph-users] Tiering stats are blank on Bluestore OSD's

2018-09-10 Thread Nick Fisk
"tier_flush_fail": 0, "tier_try_flush": 88942, "tier_try_flush_fail": 0, "tier_evict": 264773, "tier_whiteout": 35, "tier_dirty": 89314, "tier_clean": 89207, &q

Re: [ceph-users] help needed

2018-09-06 Thread Nick Fisk
If it helps, I’m seeing about a 3GB DB usage for a 3TB OSD about 60% full. This is with a pure RBD workload, I believe this can vary depending on what your Ceph use case is. From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of David Turner Sent: 06 September 2018 14:09 To

[ceph-users] New cluster issue - poor performance inside guests

2018-07-24 Thread Nick A
ng the new volume, to 140MB/sec read, 30-40k IOPs. We've tried luminous, same thing. We're only getting about 26Gbit on the 40G links but that's fine for now until we go to production, it's hardly the limiting factor here. Any ideas please? Regards, Nick

Re: [ceph-users] CephFS+NFS For VMWare

2018-07-02 Thread Nick Fisk
Quoting Ilya Dryomov : On Fri, Jun 29, 2018 at 8:08 PM Nick Fisk wrote: This is for us peeps using Ceph with VMWare. My current favoured solution for consuming Ceph in VMWare is via RBD’s formatted with XFS and exported via NFS to ESXi. This seems to perform better than iSCSI+VMFS

Re: [ceph-users] CephFS+NFS For VMWare

2018-06-30 Thread Nick Fisk
greater concern. Thanks, Nick From: Paul Emmerich [mailto:paul.emmer...@croit.io] Sent: 29 June 2018 17:57 To: Nick Fisk Cc: ceph-users Subject: Re: [ceph-users] CephFS+NFS For VMWare VMWare can be quite picky about NFS servers. Some things that you should test before deploying

[ceph-users] CephFS+NFS For VMWare

2018-06-29 Thread Nick Fisk
y simple and is in use by a large proportion of the Ceph community. CephFS is a lot easier to "upset". Nick

[ceph-users] Planning all flash cluster

2018-06-20 Thread Nick A
possible. Is there anything that obviously stands out as severely unbalanced? The R720XD comes with a H710 - instead of putting them in RAID0, I'm thinking a different HBA might be a better idea, any recommendations please? Regards, Nick

Re: [ceph-users] FAILED assert(p != recovery_info.ss.clone_snaps.end())

2018-06-14 Thread Nick Fisk
ll running Filestore on this cluster and simply removing the clone object from the OSD PG folder (Note: the object won't have _head in its name) and then running a deep scrub on the PG again fixed the issue for me. Nick -Original Message- From: ceph-users [mailto:ceph-users-boun...@lis

Re: [ceph-users] How to fix a Ceph PG in unkown state with no OSDs?

2018-06-14 Thread Nick Fisk
I’ve seen things like this happen if you end up with extreme weighting towards a small set of OSD’s. Crush tries a slightly different combination of OSD’s at each attempt, but with an extremely lopsided weighting it can run out of attempts before it finds a set of OSD’s which mat
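Where the cause really is CRUSH running out of retries against a lopsided weighting, one workaround that gets suggested is raising the choose_total_tries tunable in the decompiled CRUSH map. A hedged sketch of that round trip (the value 100 is only an example; review the decompiled map by hand before injecting anything into a production cluster):

#!/usr/bin/env python3
"""Sketch: bump the choose_total_tries CRUSH tunable."""
import re
import subprocess

# Export and decompile the current CRUSH map.
subprocess.check_call(["ceph", "osd", "getcrushmap", "-o", "crushmap.bin"])
subprocess.check_call(["crushtool", "-d", "crushmap.bin", "-o", "crushmap.txt"])

with open("crushmap.txt") as f:
    text = f.read()

# Raise choose_total_tries (50 is the usual default with modern tunables).
text = re.sub(r"tunable choose_total_tries \d+",
              "tunable choose_total_tries 100", text)

with open("crushmap.txt", "w") as f:
    f.write(text)

# Recompile; inject only after reviewing crushmap.txt.
subprocess.check_call(["crushtool", "-c", "crushmap.txt", "-o", "crushmap.new"])
# subprocess.check_call(["ceph", "osd", "setcrushmap", "-i", "crushmap.new"])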

Re: [ceph-users] Why the change from ceph-disk to ceph-volume and lvm? (and just not stick with direct disk access)

2018-06-08 Thread Nick Fisk
http://docs.ceph.com/docs/master/ceph-volume/simple/ ? From: ceph-users On Behalf Of Konstantin Shalygin Sent: 08 June 2018 11:11 To: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Why the change from ceph-disk to ceph-volume and lvm? (and just not stick with direct disk access) Wh

Re: [ceph-users] FAILED assert(p != recovery_info.ss.clone_snaps.end())

2018-06-07 Thread Nick Fisk
sing the object-store-tool, but not sure if I want to clean the clone metadata or try and remove the actual snapshot object. -Original Message- From: ceph-users On Behalf Of Nick Fisk Sent: 05 June 2018 17:22 To: 'ceph-users' Subject: Re: [ceph-users] FAILED assert(p != recover

Re: [ceph-users] FAILED assert(p != recovery_info.ss.clone_snaps.end())

2018-06-05 Thread Nick Fisk
snapshot object and then allow things to backfill? -Original Message- From: ceph-users On Behalf Of Nick Fisk Sent: 05 June 2018 16:43 To: 'ceph-users' Subject: [ceph-users] FAILED assert(p != recovery_info.ss.clone_snaps.end()) Hi, After a RBD snapshot was removed, I

Re: [ceph-users] FAILED assert(p != recovery_info.ss.clone_snaps.end())

2018-06-05 Thread Nick Fisk
From: ceph-users On Behalf Of Paul Emmerich Sent: 05 June 2018 17:02 To: n...@fisk.me.uk Cc: ceph-users Subject: Re: [ceph-users] FAILED assert(p != recovery_info.ss.clone_snaps.end()) 2018-06-05 17:42 GMT+02:00 Nick Fisk mailto:n...@fisk.me.uk> >: Hi, After a RBD snapsh

[ceph-users] FAILED assert(p != recovery_info.ss.clone_snaps.end())

2018-06-05 Thread Nick Fisk
Hi, After a RBD snapshot was removed, I seem to be having OSD's assert when they try and recover pg 1.2ca. The issue seems to follow the PG around as OSD's fail. I've seen this bug tracker and associated mailing list post, but would appreciate if anyone can give any pointers. https://tracker.cep

Re: [ceph-users] Intel Xeon Scalable and CPU frequency scaling on NVMe/SSD Ceph OSDs

2018-05-14 Thread Nick Fisk
Intel Xeon Scalable and CPU frequency scaling on NVMe/SSD Ceph OSDs On 05/01/2018 10:19 PM, Nick Fisk wrote: > 4.16 required? > https://www.phoronix.com/scan.php?page=news_item&px=Skylake-X-P-State-Linux-4.16 > I've been trying with the 4.16 kernel for the last

[ceph-users] Scrubbing impacting write latency since Luminous

2018-05-10 Thread Nick Fisk
deep scrubbing pushes it into the 30ms region. No other changes apart from the upgrade have taken place. Is anyone aware of any major changes in the way scrubbing is carried out Jewel->Luminous, which may be causing this? Thanks, Nick

Re: [ceph-users] Bluestore on HDD+SSD sync write latency experiences

2018-05-03 Thread Nick Fisk
Hi Dan, Quoting Dan van der Ster : Hi Nick, Our latency probe results (4kB rados bench) didn't change noticeably after converting a test cluster from FileStore (sata SSD journal) to BlueStore (sata SSD db). Those 4kB writes take 3-4ms on average from a random VM in our data centre
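For anyone wanting to run a latency probe like the one Dan describes, a minimal sketch driving rados bench with 4 KiB writes at queue depth 1 (the pool name and duration are placeholders; point it at a disposable pool):

#!/usr/bin/env python3
"""Sketch: a 4 KiB, queue-depth-1 rados bench latency probe."""
import subprocess

POOL = "latency-probe"   # hypothetical throwaway test pool
SECONDS = "30"

subprocess.check_call([
    "rados", "-p", POOL, "bench", SECONDS, "write",
    "-b", "4096",        # 4 KiB objects, mirroring the probe in this thread
    "-t", "1",           # one op in flight, so the result reflects latency
])
# Average/min/max latency appears in the summary at the end of the run;
# without --no-cleanup the benchmark objects are removed afterwards.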

Re: [ceph-users] Bluestore on HDD+SSD sync write latency experiences

2018-05-03 Thread Nick Fisk
Hi Nick, On 5/1/2018 11:50 PM, Nick Fisk wrote: Hi all, Slowly getting round to migrating clusters to Bluestore but I am interested in how people are handling the potential change in write latency coming from Filestore? Or maybe nobody is really seeing much difference? As we all know, in

Re: [ceph-users] Bluestore on HDD+SSD sync write latency experiences

2018-05-03 Thread Nick Fisk
-Original Message- From: Alex Gorbachev Sent: 02 May 2018 22:05 To: Nick Fisk Cc: ceph-users Subject: Re: [ceph-users] Bluestore on HDD+SSD sync write latency experiences Hi Nick, On Tue, May 1, 2018 at 4:50 PM, Nick Fisk wrote: > Hi all, > > > > Slowly getting rou

[ceph-users] Bluestore on HDD+SSD sync write latency experiences

2018-05-01 Thread Nick Fisk
ffected by this. Thanks, Nick

Re: [ceph-users] Intel Xeon Scalable and CPU frequency scaling on NVMe/SSD Ceph OSDs

2018-05-01 Thread Nick Fisk
4.16 required? https://www.phoronix.com/scan.php?page=news_item&px=Skylake-X-P-State-Linux-4.16 -Original Message- From: ceph-users On Behalf Of Blair Bethwaite Sent: 01 May 2018 16:46 To: Wido den Hollander Cc: ceph-users ; Nick Fisk Subject: Re: [ceph-users] Intel Xeon Scalable

Re: [ceph-users] pgs down after adding 260 OSDs & increasing PGs

2018-01-29 Thread Nick Fisk
Hi Jake, I suspect you have hit an issue that me and a few others have hit in Luminous. By increasing the number of PG's before all the data has re-balanced, you have probably exceeded the hard PG-per-OSD limit. See this thread https://www.spinics.net/lists/ceph-users/msg41231.html
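A quick sanity check of whether a pg_num increase will trip that hard limit can be sketched as below; the defaults shown are assumptions for Luminous, so confirm them against your own running config, and remember that mid-rebalance most PG mappings still sit on the original OSDs:

# Rough check of PGs-per-OSD against the hard limit described above.
# Defaults are assumptions; verify with "ceph daemon osd.N config get ...".
mon_max_pg_per_osd = 200              # assumed default
osd_max_pg_per_osd_hard_ratio = 2.0   # assumed default
hard_limit = mon_max_pg_per_osd * osd_max_pg_per_osd_hard_ratio

def avg_pgs_per_osd(total_pg_instances: int, num_osds: int) -> float:
    """total_pg_instances = sum over pools of pg_num * replica size."""
    return total_pg_instances / num_osds

# Hypothetical cluster mid-rebalance: divide by the OSDs that still hold
# the data, not the final OSD count, to see the worst case.
worst_case = avg_pgs_per_osd(total_pg_instances=16384 * 3, num_osds=120)
print(f"worst case ~{worst_case:.0f} PGs/OSD vs hard limit {hard_limit:.0f}")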

Re: [ceph-users] BlueStore.cc: 9363: FAILED assert(0 == "unexpected error")

2018-01-26 Thread Nick Fisk
operation 10 (op 0, counting from 0) Are they out of space, or is something mis-reporting? Nick From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of David Turner Sent: 26 January 2018 13:03 To: ceph-users Subject: [ceph-users] BlueStore.cc: 9363: FAILED assert(0

Re: [ceph-users] OSD servers swapping despite having free memory capacity

2018-01-24 Thread Nick Fisk
lem completely disappeared. I've attached a graph (if it gets through) showing the memory change between 4.10 and 4.14 on the 22nd Nov Nick > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Warren Wang > Sent: 24 January 2018 1

Re: [ceph-users] What is the should be the expected latency of 10Gbit network connections

2018-01-22 Thread Nick Fisk
Anyone with 25G ethernet willing to do the test? Would love to see what the latency figures are for that. From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Maged Mokhtar Sent: 22 January 2018 11:28 To: ceph-users@lists.ceph.com Subject: Re: [ceph-users] What is the shou

Re: [ceph-users] Ubuntu 17.10 or Debian 9.3 + Luminous = random OS hang ?

2018-01-21 Thread Nick Fisk
How up to date is your VM environment? We saw something very similar last year with Linux VM’s running newish kernels. It turns out newer kernels supported a new feature of the vmxnet3 adapters which had a bug in ESXi. The fix was release last year some time in ESXi6.5 U1, or a workaround was to

Re: [ceph-users] Cluster crash - FAILED assert(interval.last > last)

2018-01-11 Thread Nick Fisk
I take my hat off to you, well done for solving that!!! > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Zdenek Janda > Sent: 11 January 2018 13:01 > To: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] Cluster crash - FAILED assert(int

[ceph-users] Linux Meltdown (KPTI) fix and how it affects performance?

2018-01-04 Thread Nick Fisk
g the backend Ceph OSD's shouldn't really be at risk from these vulnerabilities, due to them not being direct user facing and could have this work around disabled? Nick

Re: [ceph-users] Cache tiering on Erasure coded pools

2017-12-27 Thread Nick Fisk
store, small writes are supported on erasure coded pools and so that “always a bad idea” should be read as “can be a bad idea” Nick Caspar Met vriendelijke groet, Caspar Smit Systemengineer SuperNAS Dorsvlegelstraat 13 1445 PA Purmerend t: (+31) 299 410 414 e: caspars...@supern

Re: [ceph-users] Bluestore Compression not inheriting pool option

2017-12-13 Thread Nick Fisk
Thanks for confirming, logged http://tracker.ceph.com/issues/22419 > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Stefan Kooman > Sent: 12 December 2017 20:35 > To: Nick Fisk > Cc: ceph-users@lists.ceph.com > Sub

Re: [ceph-users] Odd object blocking IO on PG

2017-12-13 Thread Nick Fisk
. I get the idea around these settings being there to stop people making too many pools or pools with too many PG’s. But is it correct they can break an existing pool which is maybe making the new PG on an OSD due to CRUSH layout being modified? Nick From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Beh

Re: [ceph-users] Health Error : Request Stuck

2017-12-13 Thread Nick Fisk
Ok, great glad you got your issue sorted. I’m still battling along with mine. From: Karun Josy [mailto:karunjo...@gmail.com] Sent: 13 December 2017 12:22 To: n...@fisk.me.uk Cc: ceph-users Subject: Re: [ceph-users] Health Error : Request Stuck Hi Nick, Finally, was able to correct

Re: [ceph-users] Odd object blocking IO on PG

2017-12-13 Thread Nick Fisk
On Tue, Dec 12, 2017 at 12:33 PM Nick Fisk mailto:n...@fisk.me.uk> > wrote: > That doesn't look like an RBD object -- any idea who is > "client.34720596.1:212637720"? So I think these might be proxy ops from the cache tier, as there are also block ops on one of the

Re: [ceph-users] Health Error : Request Stuck

2017-12-13 Thread Nick Fisk
Hi Karun, I too am experiencing something very similar with a PG stuck in activating+remapped state after re-introducing a OSD back into the cluster as Bluestore. Although this new OSD is not the one listed against the PG’s stuck activating. I also see the same thing as you where the up set

Re: [ceph-users] Odd object blocking IO on PG

2017-12-12 Thread Nick Fisk
ached) is not showing in the main status that it has been blocked from peering or that there are any missing objects. I've tried restarting all OSD's I can see relating to the PG in case they needed a bit of a nudge. > > On Tue, Dec 12, 2017 at 12:36 PM, Nick Fisk wrote: > >

[ceph-users] Bluestore Compression not inheriting pool option

2017-12-12 Thread Nick Fisk
to snappy, then immediately data starts getting compressed. It seems like when a new OSD joins the cluster, it doesn't pick up the existing compression setting on the pool. Anyone seeing anything similar? I will raise a bug if anyone can confirm.

[ceph-users] Odd object blocking IO on PG

2017-12-12 Thread Nick Fisk
My only thought so far is to wait for this backfilling to finish and then deep-scrub this PG and see if that reveals anything? Thanks, Nick "description": "osd_op(client.34720596.1:212637720 0.1cf 0.ae78c1cf (undecoded) ondisk+retry+write+ignore_cache+ignore_overl

Re: [ceph-users] what's the maximum number of OSDs per OSD server?

2017-12-10 Thread Nick Fisk
xpected to be archiving and sequential access to large (multiGB) files/objects. Nick, which physical limitations are you referring to? Thanks. Hi Igor, I guess I meant physical annoyances rather than limitations. Being able to pull out a 1 or 2U node is always much les

Re: [ceph-users] what's the maximum number of OSDs per OSD server?

2017-12-10 Thread Nick Fisk
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Igor Mendelev Sent: 10 December 2017 15:39 To: ceph-users@lists.ceph.com Subject: [ceph-users] what's the maximum number of OSDs per OSD server? Given that servers with 64 CPU cores (128 threads @ 2.7GHz) and up to 2TB RA

Re: [ceph-users] ceph all-nvme mysql performance tuning

2017-11-27 Thread Nick Fisk
few utilities that can show this in realtime. Other than that, although there could be some minor tweaks, you are probably nearing the limit of what you can hope to achieve. Nick Thanks, Best, German 2017-11-27 11:36 GMT-03:00 Maged Mokhtar mailto:mmokh...@petasan.org> >

Re: [ceph-users] Bluestore performance 50% of filestore

2017-11-18 Thread Nick Fisk
ality issues first. If that brings the figures more in line, then that could potentially steer the investigation towards why Bluestore struggles to coalesce as well as the Linux FS system. Nick > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] O

Re: [ceph-users] bluestore - wal,db on faster devices?

2017-11-08 Thread Nick Fisk
> -Original Message- > From: Mark Nelson [mailto:mnel...@redhat.com] > Sent: 08 November 2017 21:42 > To: n...@fisk.me.uk; 'Wolfgang Lendl' > Cc: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] bluestore - wal,db on faster devices? > > > >

Re: [ceph-users] Blog post: storage server power consumption

2017-11-08 Thread Nick Fisk
Also look at the new WD 10TB Red's if you want very low use archive storage. Because they spin at 5400, they only use 2.8W at idle. > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Jack > Sent: 06 November 2017 22:31 > To: ceph-users@lists.c

Re: [ceph-users] Recovery operations and ioprio options

2017-11-08 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > ??? ??? > Sent: 08 November 2017 16:21 > To: ceph-users@lists.ceph.com > Subject: [ceph-users] Recovery operations and ioprio options > > Hello, > Today we use ceph jewel with: > osd

Re: [ceph-users] bluestore - wal,db on faster devices?

2017-11-08 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Mark Nelson > Sent: 08 November 2017 19:46 > To: Wolfgang Lendl > Cc: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] bluestore - wal,db on faster devices? > > Hi Wolfgang, > > You've

Re: [ceph-users] pros/cons of multiple OSD's per host

2017-08-28 Thread Nick Tan
On Wed, Aug 23, 2017 at 2:28 PM, Christian Balzer wrote: > On Wed, 23 Aug 2017 13:38:25 +0800 Nick Tan wrote: > > > Thanks for the advice Christian. I think I'm leaning more towards the > > 'traditional' storage server with 12 disks - as you say they give a

Re: [ceph-users] pros/cons of multiple OSD's per host

2017-08-22 Thread Nick Tan
pensive for the capacities we're looking at. I'm interested in how bluestore performs without a flash/SSD WAL/DB. In my small scale testing it seems much better than filestore so I was planning on building something without any flash/SSD. There's always the option of adding it la

Re: [ceph-users] pros/cons of multiple OSD's per host

2017-08-22 Thread Nick Tan
; You may want to set up something to get a feeling for CephFS, if it's > right for you or if something else on top of RBD may be more suitable. > > I've setup a 3 node cluster, 2 OSD servers and 1 mon/mds to get a feel for ceph and cephFS. It looks pretty straightforward and perf

Re: [ceph-users] pros/cons of multiple OSD's per host

2017-08-21 Thread Nick Tan
rage hosts with 1-2 OSD's each, rather than 10 or so storage nodes with 10+ OSD's to get better parallelism but I don't have any practical experience with CephFS to really judge. And I don't have enough hardware to setup a test cluster of

Re: [ceph-users] pros/cons of multiple OSD's per host

2017-08-21 Thread Nick Tan
On Mon, Aug 21, 2017 at 3:58 PM, Ronny Aasen wrote: > On 21. aug. 2017 07:40, Nick Tan wrote: > >> Hi all, >> >> I'm in the process of building a ceph cluster, primarily to use cephFS. >> At this stage I'm in the planning phase and doing a lot of readi

Re: [ceph-users] pros/cons of multiple OSD's per host

2017-08-20 Thread Nick Tan
On Mon, Aug 21, 2017 at 1:58 PM, Christian Balzer wrote: > On Mon, 21 Aug 2017 13:40:29 +0800 Nick Tan wrote: > > > Hi all, > > > > I'm in the process of building a ceph cluster, primarily to use cephFS. > At > > this stage I'm in the planni

[ceph-users] pros/cons of multiple OSD's per host

2017-08-20 Thread Nick Tan
all servers with single OSD's vs fewer large servers with lots of OSD's? Thanks, Nick

Re: [ceph-users] VMware + Ceph using NFS sync/async ?

2017-08-16 Thread Nick Fisk
sense of the application workflow here--and Nick appears to--but I thought it worth noting that NFSv3 and NFSv4 clients shouldn't normally need the sync mount option to achieve i/o stability with well-behaved applications. In both versions of the protocol, an application wri

Re: [ceph-users] VMware + Ceph using NFS sync/async ?

2017-08-14 Thread Nick Fisk
IO's. One thing you can try is to force the CPU's on your OSD nodes to run at C1 cstate and force their minimum frequency to 100%. This can have quite a large impact on latency. Also you don't specify your network, but 10G is a must. Nick
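A minimal sketch of the tuning Nick describes, using the generic Linux cpufreq sysfs and PM-QoS interfaces (run as root; the paths assume a modern cpufreq layout, and whether the extra power draw is worth it depends on your workload):

#!/usr/bin/env python3
"""Sketch: pin CPUs to max frequency and shallow C-states on an OSD node."""
import glob
import os
import struct
import time

# 1. Performance governor, with the minimum frequency raised to the maximum.
for policy in glob.glob("/sys/devices/system/cpu/cpufreq/policy*"):
    with open(os.path.join(policy, "cpuinfo_max_freq")) as f:
        max_freq = f.read().strip()
    with open(os.path.join(policy, "scaling_governor"), "w") as f:
        f.write("performance")
    with open(os.path.join(policy, "scaling_min_freq"), "w") as f:
        f.write(max_freq)

# 2. Request a 1us wakeup-latency limit via PM QoS, which in practice keeps
#    cores at C0/C1. The request only holds while the fd stays open, so this
#    would normally run as a small always-on service.
fd = os.open("/dev/cpu_dma_latency", os.O_WRONLY)
os.write(fd, struct.pack("I", 1))
try:
    while True:
        time.sleep(3600)
finally:
    os.close(fd)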

Re: [ceph-users] luminous/bluetsore osd memory requirements

2017-08-14 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Ronny Aasen > Sent: 14 August 2017 18:55 > To: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] luminous/bluetsore osd memory requirements > > On 10.08.2017 17:30, Gregory Farnum wrote: >

Re: [ceph-users] luminous/bluetsore osd memory requirements

2017-08-13 Thread Nick Fisk
Sat, Aug 12, 2017, 2:40 PM Nick Fisk mailto:n...@fisk.me.uk> > wrote: I was under the impression the memory requirements for Bluestore would be around 2-3GB per OSD regardless of capacity. CPU wise, I would lean towards working out how much total Ghz you require and then get whatever CPU yo

Re: [ceph-users] luminous/bluetsore osd memory requirements

2017-08-12 Thread Nick Fisk
I would possibly look at the single socket E3's or E5's. Although saying that, the recent AMD and Intel announcements also have some potentially interesting single socket Ceph potentials in the mix. Hope that helps. Nick > -Original Message- > From: ceph-users [mailto:ceph-u

Re: [ceph-users] ceph cluster experiencing major performance issues

2017-08-08 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Mclean, Patrick > Sent: 08 August 2017 20:13 > To: David Turner ; ceph-us...@ceph.com > Cc: Colenbrander, Roelof ; Payno, > Victor ; Yip, Rae > Subject: Re: [ceph-users] ceph cluster experie

Re: [ceph-users] Kernel mounted RBD's hanging

2017-07-31 Thread Nick Fisk
> -Original Message- > From: Ilya Dryomov [mailto:idryo...@gmail.com] > Sent: 31 July 2017 11:36 > To: Nick Fisk > Cc: Ceph Users > Subject: Re: [ceph-users] Kernel mounted RBD's hanging > > On Thu, Jul 13, 2017 at 12:54 PM, Ilya Dryomov wrote: > &

Re: [ceph-users] RBD cache being filled up in small increases instead of 4MB

2017-07-15 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Gregory Farnum > Sent: 15 July 2017 00:09 > To: Ruben Rodriguez > Cc: ceph-users > Subject: Re: [ceph-users] RBD cache being filled up in small increases instead > of 4MB > > On Fri, Jul 14,

Re: [ceph-users] Ceph mount rbd

2017-07-14 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Jason Dillaman > Sent: 14 July 2017 16:40 > To: li...@marcelofrota.info > Cc: ceph-users > Subject: Re: [ceph-users] Ceph mount rbd > > On Fri, Jul 14, 2017 at 9:44 AM, wrote: > > Gonzal

Re: [ceph-users] Kernel mounted RBD's hanging

2017-07-12 Thread Nick Fisk
> -Original Message- > From: Nick Fisk [mailto:n...@fisk.me.uk] > Sent: 12 July 2017 13:47 > To: 'Ilya Dryomov' > Cc: 'Ceph Users' > Subject: RE: [ceph-users] Kernel mounted RBD's hanging > > > -Original Message- > >

Re: [ceph-users] Kernel mounted RBD's hanging

2017-07-08 Thread Nick Fisk
> -Original Message- > From: Ilya Dryomov [mailto:idryo...@gmail.com] > Sent: 07 July 2017 11:32 > To: Nick Fisk > Cc: Ceph Users > Subject: Re: [ceph-users] Kernel mounted RBD's hanging > > On Fri, Jul 7, 2017 at 12:10 PM, Nick Fisk wrote: > > M

Re: [ceph-users] Kernel mounted RBD's hanging

2017-07-07 Thread Nick Fisk
> -Original Message- > From: Ilya Dryomov [mailto:idryo...@gmail.com] > Sent: 01 July 2017 13:19 > To: Nick Fisk > Cc: Ceph Users > Subject: Re: [ceph-users] Kernel mounted RBD's hanging > > On Sat, Jul 1, 2017 at 9:29 AM, Nick Fisk wrote: > >>

Re: [ceph-users] Kernel mounted RBD's hanging

2017-07-01 Thread Nick Fisk
> -Original Message- > From: Ilya Dryomov [mailto:idryo...@gmail.com] > Sent: 30 June 2017 14:06 > To: Nick Fisk > Cc: Ceph Users > Subject: Re: [ceph-users] Kernel mounted RBD's hanging > > On Fri, Jun 30, 2017 at 2:14 PM, Nick Fisk wrote: >

Re: [ceph-users] Kernel mounted RBD's hanging

2017-06-30 Thread Nick Fisk
> -Original Message- > From: Ilya Dryomov [mailto:idryo...@gmail.com] > Sent: 29 June 2017 18:54 > To: Nick Fisk > Cc: Ceph Users > Subject: Re: [ceph-users] Kernel mounted RBD's hanging > > On Thu, Jun 29, 2017 at 6:22 PM, Nick Fisk wrote: > >>

Re: [ceph-users] Kernel mounted RBD's hanging

2017-06-30 Thread Nick Fisk
From: Alex Gorbachev [mailto:a...@iss-integration.com] Sent: 30 June 2017 03:54 To: Ceph Users ; n...@fisk.me.uk Subject: Re: [ceph-users] Kernel mounted RBD's hanging On Thu, Jun 29, 2017 at 10:30 AM Nick Fisk mailto:n...@fisk.me.uk> > wrote: Hi All, Putting out a call for

Re: [ceph-users] Kernel mounted RBD's hanging

2017-06-29 Thread Nick Fisk
> -Original Message- > From: Ilya Dryomov [mailto:idryo...@gmail.com] > Sent: 29 June 2017 16:58 > To: Nick Fisk > Cc: Ceph Users > Subject: Re: [ceph-users] Kernel mounted RBD's hanging > > On Thu, Jun 29, 2017 at 4:30 PM, Nick Fisk wrote: > > Hi

[ceph-users] Kernel mounted RBD's hanging

2017-06-29 Thread Nick Fisk
and anything else that could be causing high load - Enabling Kernel RBD debugging (Problem maybe happens a couple of times a day, debug logging was not practical as I can't reproduce on demand) Anyone have any ideas? Thanks, Nick

Re: [ceph-users] Ceph random read IOPS

2017-06-26 Thread Nick Fisk
th=[ 716], 90.00th=[ 764], 95.00th=[ 820], | 99.00th=[ 1448], 99.50th=[ 2320], 99.90th=[ 7584], 99.95th=[11712], | 99.99th=[24448] Quite a bit faster. Although these are best case figures, if any substantial workload is run, the average tends to hover around 1ms latency. Nick

Re: [ceph-users] Ceph random read IOPS

2017-06-24 Thread Nick Fisk
Apologies for the top post, I can't seem to break indents on my phone. Anyway the point of that test was as maged suggests to show the effect of serial CPU speed on latency. IO is effectively serialised by the pg lock, and so trying to reduce the time spent in this area is key. Fast cpu, fast ne

Re: [ceph-users] VMware + CEPH Integration

2017-06-22 Thread Nick Fisk
> -Original Message- > From: Adrian Saul [mailto:adrian.s...@tpgtelecom.com.au] > Sent: 19 June 2017 06:54 > To: n...@fisk.me.uk; 'Alex Gorbachev' > Cc: 'ceph-users' > Subject: RE: [ceph-users] VMware + CEPH Integration > > > Hi Alex, > > > > Have you experienced any problems with timeou

Re: [ceph-users] VMware + CEPH Integration

2017-06-17 Thread Nick Fisk
gain in our cluster the FS and Exportfs resources timeout in pacemaker. There's no mention of any slow requests or any peering..etc from the ceph logs so it's a bit of a mystery. Nick > > Best regards, > Alex Gorbachev > Storcium > > > > > > T

Re: [ceph-users] 2x replica with NVMe

2017-06-08 Thread Nick Fisk
Bluestore will make 2x Replica’s “safer” to use in theory. Until Bluestore is in use in the wild, I don’t think anyone can give any guarantees. From: i...@witeq.com [mailto:i...@witeq.com] Sent: 08 June 2017 14:32 To: nick Cc: Vy Nguyen Tan ; ceph-users Subject: Re: [ceph-users] 2x

Re: [ceph-users] 2x replica with NVMe

2017-06-08 Thread Nick Fisk
There are two main concerns with using 2x replicas: recovery speed and coming across inconsistent objects. With spinning disks, the ratio of their size to access speed means recovery can take a long time and increases the chance that additional failures may happen during the recovery process. NVME will re

Re: [ceph-users] Changing SSD Landscape

2017-05-18 Thread Nick Fisk
data before next year, you're a > > lot braver than me. > > An early adoption scheme with Bluestore nodes being in their own > > failure domain (rack) would be the best I could see myself doing in my > > generic cluster. > > For the 2 mission critical produ

Re: [ceph-users] Changing SSD Landscape

2017-05-17 Thread Nick Fisk
Hi Dan, > -Original Message- > From: Dan van der Ster [mailto:d...@vanderster.com] > Sent: 17 May 2017 10:29 > To: Nick Fisk > Cc: ceph-users > Subject: Re: [ceph-users] Changing SSD Landscape > > I am currently pricing out some DCS3520's, for OSDs. Word

[ceph-users] Changing SSD Landscape

2017-05-17 Thread Nick Fisk
ituation and is struggling to find good P3700 400G replacements? Nick

Re: [ceph-users] Ceph memory overhead when used with KVM

2017-05-16 Thread nick
Thanks for the explanation. I will create a ticket on the tracker then. Cheers Nick On Tuesday, May 16, 2017 08:16:33 AM Jason Dillaman wrote: > Sorry, I haven't had a chance to attempt to reproduce. > > I do know that the librbd in-memory cache does not restrict incoming > IO

Re: [ceph-users] Ceph memory overhead when used with KVM

2017-05-15 Thread nick
Hi Jason, did you have some time to check if you can reproduce the high memory usage? I am not sure if I should create a bug report for this or if this is expected behaviour. Cheers Nick On Monday, May 08, 2017 08:55:55 AM you wrote: > Thanks. One more question: was the image a clone o

Re: [ceph-users] Ceph memory overhead when used with KVM

2017-05-08 Thread nick
Hi, I was using a standalone rbd image. Cheers Nick On Monday, May 08, 2017 08:55:55 AM Jason Dillaman wrote: > Thanks. One more question: was the image a clone or a stand-alone image? > > On Fri, May 5, 2017 at 2:42 AM, nick wrote: > > Hi, > > I used one of the fio exam

Re: [ceph-users] Ceph memory overhead when used with KVM

2017-05-04 Thread nick
mpt to > repeat locally? > > On Tue, May 2, 2017 at 2:51 AM, nick wrote: > > Hi Jason, > > thanks for your feedback. I did now some tests over the weekend to verify > > the memory overhead. > > I was using qemu 2.8 (taken from the Ubuntu Cloud Archive) with librbd &

Re: [ceph-users] Intel power tuning - 30% throughput performance increase

2017-05-03 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Blair Bethwaite > Sent: 03 May 2017 09:53 > To: Dan van der Ster > Cc: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] Intel power tuning - 30% throughput performance > increase > > O

Re: [ceph-users] Ceph memory overhead when used with KVM

2017-05-01 Thread nick
=libmultip > ath/checkers/rbd.c;h=9ea0572f2b5bd41b80bf2601137b74f92bdc7278;hb=HEAD > On Thu, Apr 27, 2017 at 5:26 AM, nick wrote: > > Hi Christian, > > thanks for your answer. > > The highest value I can see for a local storage VM in our infrastructure > > is a memory overhead of 39%. This i

Re: [ceph-users] Maintaining write performance under a steady intake of small objects

2017-05-01 Thread Nick Fisk
lot easier to use with Ceph. Nick From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Patrick Dinnen Sent: 01 May 2017 19:07 To: ceph-users@lists.ceph.com Subject: [ceph-users] Maintaining write performance under a steady intake of small objects Hello Ceph-users

Re: [ceph-users] Ceph memory overhead when used with KVM

2017-04-27 Thread nick
; specific from where I'm standing. > > While non-RBD storage VMs by and large tend to be closer the specified > size, I've seen them exceed things by few % at times, too. > For example a 4317968KB RSS one that ought to be 4GB. > > Regards, > > Christian > >
