Re: [ceph-users] Please help me get rid of Slow / blocked requests

2018-05-01 Thread Van Leeuwen, Robert
ble performance of your cluster. Cheers, Robert van Leeuwen ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] data partition and journal on same disk

2015-12-17 Thread Mart van Santen
rs mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- Mart van Santen Greenhost E: m...@greenhost.nl T: +31 20 4890444 W: https://greenhost.nl A PGP signature can be attached to this e-mail, you need PGP software to verify it. My public key

Re: [ceph-users] use object size of 32k rather than 4M

2015-12-23 Thread Van Leeuwen, Robert
n the beginning but it will grind to a halt when the cluster gets fuller due to inodes no longer being in memory. Also this does not take into account any other bottlenecks you might hit in Ceph, which other users can probably answer better. Cheers, Robert van Leeuwen ___

Re: [ceph-users] use object size of 32k rather than 4M

2015-12-23 Thread Van Leeuwen, Robert
52.4 Mi Used Inodes: 0.6% 52 million files without extended attributes is probably not a real life scenario for a filled up ceph node with multiple OSDs. Cheers, Robert van Leeuwen ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.

Re: [ceph-users] Intel S3710 400GB and Samsung PM863 480GB fio results

2015-12-23 Thread Mart van Santen
list >>>> ceph-users@lists.ceph.com >>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >>>> >>> >>> -- >>> Wido den Hollander >>> 42on B.V. >>> Ceph trainer and consultant >>> >>> Phone:

Re: [ceph-users] Intel S3710 400GB and Samsung PM863 480GB fio results

2015-12-23 Thread Mart van Santen
Hello, On 12/23/2015 04:38 PM, Lionel Bouton wrote: > On 23/12/2015 16:18, Mart van Santen wrote: >> Hi all, >> >> >> On 12/22/2015 01:55 PM, Wido den Hollander wrote: >>> On 22-12-15 13:43, Andrei Mikhailovsky wrote: >>>> Hello guys,

Re: [ceph-users] Intel S3710 400GB and Samsung PM863 480GB fio results

2015-12-24 Thread Mart van Santen
g 79 MB/s on the single job test. > > In my case the SSDs are running off the onboard Intel C204 chipset's > SATA controllers on a couple of systems with single Xeon E3-1240v2 CPUs. > > Alex > > On 23/12/2015 6:39 PM, Lionel Bouton wrote: >> On 23/12/2015 18:37, Mart van

Re: [ceph-users] ceph osd tree output

2016-01-07 Thread Mart van Santen
Hi, Do you have by any chance disabled automatic crushmap updates in your ceph config? osd crush update on start = false If this is the case, and you move disks around hosts, they won't update their position/host in the crushmap, even if the crushmap does not reflect reality. Regards, Mart
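
For reference, that option lives in ceph.conf, and after a physical move the OSD's location then has to be set by hand. A minimal sketch, assuming a host named new-host and OSD 12 (weight and names are examples):

    [osd]
    # do not let the OSD reposition itself in the crushmap on start
    osd crush update on start = false

    # after moving the disk, place it manually
    ceph osd crush create-or-move osd.12 3.64 root=default host=new-host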

Re: [ceph-users] Local SSD cache for ceph on each compute node.

2016-03-16 Thread Van Leeuwen, Robert
you work for free ;-) Also reads from ceph are pretty fast compared to the biggest bottleneck: (small) sync writes. So it is debatable how much performance you would win except for some use-cases with lots of reads on very large data sets which are also very latency sensitive. Cheers,

Re: [ceph-users] Local SSD cache for ceph on each compute node.

2016-03-16 Thread Van Leeuwen, Robert
olumes survive a power outage. If you can survive missing that data, you are probably better off running fully from ephemeral storage in the first place. Cheers, Robert van Leeuwen ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Local SSD cache for ceph on each compute node.

2016-03-29 Thread Van Leeuwen, Robert
On 3/27/16, 9:59 AM, "Ric Wheeler" wrote: >On 03/16/2016 12:15 PM, Van Leeuwen, Robert wrote: >>> My understanding of how a writeback cache should work is that it should >>> only take a few seconds for writes to be streamed onto the network and is

Re: [ceph-users] Local SSD cache for ceph on each compute node.

2016-03-29 Thread Van Leeuwen, Robert
>>> If you try to look at the rbd device under dm-cache from another host, of >>> course >>> any data that was cached on the dm-cache layer will be missing since the >>> dm-cache device itself is local to the host you wrote the data from >>> originally. >> And here it can (and probably will) go

Re: [ceph-users] ceph pg query hangs for ever

2016-03-30 Thread Mart van Santen
Hi there, With the help of a lot of people we were able to repair the PG and restored service. We will get back on this later with a full report for future reference. Regards, Mart On 03/30/2016 08:30 PM, Wido den Hollander wrote: > Hi, > > I have an issue with a Ceph cluster which I can't re

Re: [ceph-users] ceph pg query hangs for ever

2016-03-30 Thread Mart van Santen
1/ 5 mds_log 1/ 5 mds_log_expire 1/ 5 mds_migrator 0/ 1 buffer 0/ 1 timer 0/ 1 filer 0/ 1 striper 0/ 1 objecter 0/ 5 rados 0/ 5 rbd 0/ 5 rbd_replay 0/ 5 journaler 0/ 5 objectcacher 0/ 5 client 0/ 5 osd 0/ 5 optracker 0/ 5 objclass 1/ 3 filestore

[ceph-users] Image has watchers, but cannot determine why

2019-01-09 Thread Kenneth Van Alstyne
| grep -i qemu | grep -i rbd | grep -i 145 # ceph version ceph version 10.2.10 (5dc1e4c05cb68dbf62ae6fce3f0700e4654fdbbe) # Thanks, -- Kenneth Van Alstyne Systems Architect Knight Point Systems, LLC Service-Disabled Veteran-Owned Business 1775 Wiehle Avenue Suite 101 | Reston, VA 20190 c: 228-547-8
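
To see who is holding the watch, these commands are the usual starting point (a sketch; the pool and image names are placeholders):

    # watchers as librbd reports them
    rbd status rbd/my-image
    # or ask the header object directly
    rados -p rbd listwatchers rbd_header.<image-id>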

Re: [ceph-users] Image has watchers, but cannot determine why

2019-01-10 Thread Kenneth Van Alstyne
the watcher did indeed go away and I was able to remove the images. Very, very strange. (But situation solved… except I don’t know what the cause was, really.) Thanks, -- Kenneth Van Alstyne Systems Architect Knight Point Systems, LLC Service-Disabled Veteran-Owned Business 1775 Wiehle Avenue

[ceph-users] RBD Mirror Proxy Support?

2019-01-11 Thread Kenneth Van Alstyne
. Has anything been done in this regard? If not, is my best bet perhaps a tertiary cluster that both can reach and do one-way replication to? Thanks, -- Kenneth Van Alstyne Systems Architect Knight Point Systems, LLC Service-Disabled Veteran-Owned Business 1775 Wiehle Avenue Suite 101 | Reston, VA

Re: [ceph-users] RBD Mirror Proxy Support?

2019-01-14 Thread Kenneth Van Alstyne
build out a test lab to see how that would work for us. Thanks, -- Kenneth Van Alstyne Systems Architect Knight Point Systems, LLC Service-Disabled Veteran-Owned Business 1775 Wiehle Avenue Suite 101 | Reston, VA 20190 c: 228-547-8045 f: 571-266-3106 www.knightpoint.com<h

Re: [ceph-users] RBD Mirror Proxy Support?

2019-01-14 Thread Kenneth Van Alstyne
In this case, I’m imagining Clusters A/B both having write access to a third “Cluster C”. So A/B -> C rather than A -> C -> B / B -> C -> A / A -> B -> C. I admit, in the event that I need to replicate back to either primary cluster, there may be challenges. Thanks, -

Re: [ceph-users] RBD Mirror Proxy Support?

2019-01-14 Thread Kenneth Van Alstyne
D’oh! I was hoping that the destination pools could be unique names, regardless of the source pool name. Thanks, -- Kenneth Van Alstyne Systems Architect Knight Point Systems, LLC Service-Disabled Veteran-Owned Business 1775 Wiehle Avenue Suite 101 | Reston, VA 20190 c: 228-547-8045 f: 571-266

[ceph-users] Filestore OSD on CephFS?

2019-01-16 Thread Kenneth Van Alstyne
B 0.02 POOLS: NAME ID USED %USED MAX AVAIL OBJECTS rbd 1 133 B 0 83 GiB 10 # df -h /var/lib/ceph/osd/cephfs-0/ Filesystem Size Used Avail Use% Mounted on 10.0.0.1:/ceph-remote 87G 12M 87G 1% /var/lib/ce

Re: [ceph-users] Filestore OSD on CephFS?

2019-01-16 Thread Kenneth Van Alstyne
have that. The single OSD is simply due to the underlying cluster already either being erasure coded or replicated. Thanks, -- Kenneth Van Alstyne Systems Architect Knight Point Systems, LLC Service-Disabled Veteran-Owned Business 1775 Wiehle Avenue Suite 101 | Reston, VA 20190 c: 228-547-8045 f

Re: [ceph-users] Filestore OSD on CephFS?

2019-01-16 Thread Kenneth Van Alstyne
mind — I just didn’t want to risk impacting the underlying cluster too much or hit any other caveats that perhaps someone else has run into before. I doubt many people have tried CephFS as a Filestore OSD since in general, it seems like a pretty silly idea. Thanks, -- Kenneth Van Alstyne

Re: [ceph-users] Filestore OSD on CephFS?

2019-01-16 Thread Kenneth Van Alstyne
I’d actually rather it not be an extra cluster, but can the destination pool name be different? If not, I have conflicting image names in the “rbd” pool on either side. Thanks, -- Kenneth Van Alstyne Systems Architect Knight Point Systems, LLC Service-Disabled Veteran-Owned Business 1775

[ceph-users] Does "mark_unfound_lost delete" only delete missing/unfound objects of a PG

2019-01-22 Thread Mathijs van Veluw
Hello. I have a question about `ceph pg {pg.num} mark_unfound_lost delete`. Will this only delete objects which are unfound, or the whole PG which you put in as an argument? Objects (oid's) which I can see with `ceph pg {pg.num} list_missing`? So in the case below, would it remove the object "rbd_

Re: [ceph-users] Does "mark_unfound_lost delete" only delete missing/unfound objects of a PG

2019-01-25 Thread Mathijs van Veluw
It has been resolved. It seems that it in fact only removes the objects which you list with the list_missing option. On Tue, Jan 22, 2019 at 9:42 AM Mathijs van Veluw < mathijs.van.ve...@gmail.com> wrote: > Hello. > I have a question about `ceph pg {pg.num} mark_unfound_lost del
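
For future readers, the sequence being discussed, sketched with a placeholder PG id (this discards only the listed unfound objects, not the whole PG):

    # show the objects this PG cannot find
    ceph pg 11.2f8 list_missing
    # give up on exactly those unfound objects
    ceph pg 11.2f8 mark_unfound_lost delete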

[ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Philippe Van Hecke
Hello, I am working for BELNET, the Belgian National Research Network. We currently manage a Luminous ceph cluster on Ubuntu 16.04 with 144 HDD OSDs spread across two data centers, with 6 OSD nodes in each datacenter. OSDs are 4 TB SATA disks. Last week we had a network incident and the link betw

Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Philippe Van Hecke
From: Sage Weil Sent: 03 February 2019 18:25 To: Philippe Van Hecke Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Luminous cluster in very bad state need some assistance. On Sun, 3 Feb 2019, Philippe Van Hecke wrote: > Hello, > I'

Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Philippe Van Hecke
ere https://filesender.belnet.be/?s=download&token=e2b1fdbc-0739-423f-9d97-0bd258843a33 file ceph-objectstore-tool-export-remove.txt Kr Philippe. From: Sage Weil Sent: 04 February 2019 06:59 To: Philippe Van Hecke Cc: ceph-users@lists.ceph.com;

Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Philippe Van Hecke
to forget that it inform me about it immediately :-) Kr Philippe. On Mon, 4 Feb 2019, Sage Weil wrote: > On Mon, 4 Feb 2019, Philippe Van Hecke wrote: > > Hi Sage, First of all tanks for your help > > > > Please find here > > https://filesender.belnet.be/?s=d

Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Philippe Van Hecke
65 2019-02-01 12:48:41.343144 66131'19810044 2019-01-30 11:44:36.006505 cp done. So I can now run the ceph-objectstore-tool --op remove command? From: Sage Weil Sent: 04 February 2019 07:26 To: Philippe Van Hecke Cc: ceph-users@lists.ceph.com; Belnet S

Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Philippe Van Hecke
sh_remove_pgs 11.182_head removing 11.182 Remove successful So now I suppose I restart the OSD and see. From: Sage Weil Sent: 04 February 2019 07:37 To: Philippe Van Hecke Cc: ceph-users@lists.ceph.com; Belnet Services Subject: Re: [ceph-users] Luminous cluste
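
The rough procedure from this thread, sketched with placeholder IDs (the OSD must be stopped first, exporting before removing is the safe order, and some versions also require --force on the remove):

    systemctl stop ceph-osd@63
    # keep a copy of the PG before deleting it from this OSD
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-63 \
        --pgid 11.182 --op export --file /root/pg-11.182.export
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-63 \
        --pgid 11.182 --op remove
    systemctl start ceph-osd@63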

Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-03 Thread Philippe Van Hecke
ippe. From: Philippe Van Hecke Sent: 04 February 2019 07:42 To: Sage Weil Cc: ceph-users@lists.ceph.com; Belnet Services Subject: Re: [ceph-users] Luminous cluster in very bad state need some assistance. oot@ls-node-5-lcl:~# ceph-objectstore-tool --data-path /var

Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-04 Thread Philippe Van Hecke
Hi, It seems that the recovery process stopped and we are back in the same situation as before. I hope that the log can provide more info. Anyway, thanks already for your assistance. Kr Philippe. From: Philippe Van Hecke Sent: 04 February 2019 07:53 To: Sage

Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-04 Thread Philippe Van Hecke
"osd is down" }, { "osd": "51", "status": "osd is down" }, { "osd": "63", "status

Re: [ceph-users] Luminous cluster in very bad state need some assistance.

2019-02-10 Thread Philippe Van Hecke
From: Philippe Van Hecke Sent: 04 February 2019 07:27 To: Sage Weil Cc: ceph-users@lists.ceph.com; Belnet Services; ceph-de...@vger.kernel.org Subject: Re: [ceph-users] Luminous cluster in very bad state need some assistance. Sage, Not during the network flap or before flap , but after i had al

Re: [ceph-users] Nautilus upgrade but older releases reported by features

2019-03-27 Thread Kenneth Van Alstyne
705696d4fe619afc) nautilus (stable)": 1 }, "mds": { "ceph version 14.2.0 (3a54b2b6d167d4a2a19e003a705696d4fe619afc) nautilus (stable)": 1 }, "rgw": { "ceph version 14.2.0 (3a54b2b6d167d4a2a19e003a705696d4fe619afc) nautilus (st
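
To see which daemons and clients are still reporting older releases or feature bits, these two commands give the breakdown (a generic sketch, not the poster's output):

    # release/version per daemon type
    ceph versions
    # feature bits per group, including connected clients
    ceph features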

Re: [ceph-users] VM management setup

2019-04-05 Thread Kenneth Van Alstyne
5.6.1 or wait for 5.8.1 to be released since the issues have already been fixed upstream. Thanks, -- Kenneth Van Alstyne Systems Architect Knight Point Systems, LLC Service-Disabled Veteran-Owned Business 1775 Wiehle Avenue Suite 101 | Reston, VA 20190 c: 228-547-8045 f: 571-266-3106

Re: [ceph-users] Data distribution question

2019-04-30 Thread Kenneth Van Alstyne
Shain: Have you looked into doing a "ceph osd reweight-by-utilization” by chance? I’ve found that data distribution is rarely perfect and on aging clusters, I always have to do this periodically. Thanks, -- Kenneth Van Alstyne Systems Architect Knight Point Systems, LLC Service-Dis
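
A sketch of that workflow; the threshold of 110 (OSDs more than 10% above the mean utilization) is an example value:

    # dry run first: show which reweights would be applied
    ceph osd test-reweight-by-utilization 110
    # then apply
    ceph osd reweight-by-utilization 110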

Re: [ceph-users] Data distribution question

2019-04-30 Thread Kenneth Van Alstyne
Unfortunately it looks like he’s still on Luminous, but if upgrading is an option, the options are indeed significantly better. If I recall correctly, at least the balancer module is available in Luminous. Thanks, -- Kenneth Van Alstyne Systems Architect Knight Point Systems, LLC Service
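
If the balancer module is used instead, the basic enablement on Luminous looks roughly like this (crush-compat mode is the choice that stays compatible with pre-Luminous clients; a sketch only):

    ceph mgr module enable balancer
    ceph balancer mode crush-compat
    ceph balancer on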

Re: [ceph-users] Broken mirrors: hk, us-east, de, se, cz, gigenet

2019-06-16 Thread Mart van Santen
hk: I have limited disk capacity and the disk filled up and at this point I do not have any way to extend capacity. I will recheck the monitoring of this. I've now excluded archive, hammer and giant for now. So at least newer versions are there. It is currently syncing, which will take some t

[ceph-users] Delay time in Multi-site sync

2019-08-06 Thread Hoan Nguyen Van
Hi all. I want to add a delay to the sync process from the primary zone to the secondary zone, so that if someone deletes my data I have enough time to react. How can I do it? Configure some options, install more proxies? Any solutions? Thanks. Regards ___ ceph-user

[ceph-users] Ceph capacity versus pool replicated size discrepancy?

2019-08-13 Thread Kenneth Van Alstyne
done rbd size: 3 data size: 3 metadata size: 3 .rgw.root size: 3 default.rgw.control size: 3 default.rgw.meta size: 3 default.rgw.log size: 3 default.rgw.buckets.index size: 3 default.rgw.buckets.data size: 3 default.rgw.buckets.non-ec size: 3 Thanks, -- K
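
The flattened "size: 3" listing above looks like the output of a loop over the pools; a hedged reconstruction:

    for pool in $(rados lspools); do
        echo -n "$pool "
        ceph osd pool get "$pool" size
    done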

Re: [ceph-users] Ceph capacity versus pool replicated size discrepancy?

2019-08-14 Thread Kenneth Van Alstyne
Got it! I can calculate individual clone usage using “rbd du”, but does anything exist to show total clone usage across the pool? Otherwise it looks like phantom space is just missing. Thanks, -- Kenneth Van Alstyne Systems Architect M: 228.547.8045 15052 Conference Center Dr, Chantilly, VA
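
There is no single clone-usage counter that I know of, but a pool-wide "rbd du" prints a per-image breakdown plus a total row, which gets close (the pool name is an example):

    # usage of every image (and snapshot) in the pool, with a TOTAL line
    rbd du -p rbd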

[ceph-users] Panic in kernel CephFS client after kernel update

2019-10-01 Thread Kenneth Van Alstyne
ashed machine and to avoid attaching an image, I’ll link to where they are: http://kvanals.kvanals.org/.ceph_kernel_panic_images/ Am I way off base or has anyone else run into this issue? Thanks, -- Kenneth Van Alstyne Systems Architect M: 228.547.8045 15052

Re: [ceph-users] Panic in kernel CephFS client after kernel update

2019-10-05 Thread Kenneth Van Alstyne
Thanks! I’ll remove my patch from my local build of the 4.19 kernel and upgrade to 4.19.77. Appreciate the quick fix. Thanks, -- Kenneth Van Alstyne Systems Architect M: 228.547.8045 15052 Conference Center Dr, Chantilly, VA 20151 perspecta On Oct 5, 2019, at 7:29 AM, Ilya Dryomov

Re: [ceph-users] How to backup mon-data?

2014-05-23 Thread Dan Van Der Ster
data to a safe place. Cheers, Dan -- Dan van der Ster || Data & Storage Services || CERN IT Department -- On 23 May 2014, at 15:45, Fabian Zimmermann wrote: > Hello, > > I’m running a 3 node cluster with 2 hdd/osd and one mon on each node. > Sadly the fsyncs done by mon-process

Re: [ceph-users] Ceph and low latency kernel

2014-05-25 Thread Dan Van Der Ster
I very briefly tried kernel-rt from RH MRG, and it didn't make any noticeable difference. Though I didn't spend any time tuning things. Cheers, Dan On May 25, 2014 11:04 AM, Stefan Priebe - Profihost AG wrote: Hi, has anybody ever tried to use a low latency kernel for ceph? Does it make any d

Re: [ceph-users] v0.67.9 Dumpling released

2014-06-04 Thread Dan Van Der Ster
im_sleep = … ? Cheers, Dan -- Dan van der Ster || Data & Storage Services || CERN IT Department -- ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] v0.67.9 Dumpling released

2014-06-04 Thread Dan Van Der Ster
On 04 Jun 2014, at 16:06, Sage Weil wrote: > On Wed, 4 Jun 2014, Dan Van Der Ster wrote: >> Hi Sage, all, >> >> On 21 May 2014, at 22:02, Sage Weil wrote: >> >>> * osd: allow snap trim throttling with simple delay (#6278, Sage Weil) >> >> D

Re: [ceph-users] v0.67.9 Dumpling released

2014-06-04 Thread Dan Van Der Ster
On 04 Jun 2014, at 16:06, Sage Weil wrote: > You can adjust this on running OSDs with something like 'ceph daemon > osd.NN config set osd_snap_trim_sleep .01' or with 'ceph tell osd.* > injectargs -- --osd-snap-trim-sleep .01'. Thanks, trying that now. I noticed that using = 0.01 in ceph.conf
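
Putting the variants side by side (values are examples; the ceph.conf form only takes effect when the OSD restarts):

    # at runtime, one OSD
    ceph daemon osd.12 config set osd_snap_trim_sleep 0.01
    # at runtime, all OSDs
    ceph tell osd.* injectargs -- --osd-snap-trim-sleep 0.01
    # persistently, under [osd] in ceph.conf
    osd snap trim sleep = 0.01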

Re: [ceph-users] Problem installing ceph from package manager / ceph repositories

2014-06-10 Thread Dan Van Der Ster
faced this, though I have done several ceph cluster installations with the package manager. I don’t want the EPEL version of Ceph. You probably need to tweak the repo priorities. We use priority=30 for epel.repo, priority=5 for ceph.repo. Cheers, Dan -- Dan van der Ster || Data & Storage Servic
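
A sketch of that priority setup on an EL system (requires the yum priorities plugin; the numbers are the ones quoted above):

    yum install yum-plugin-priorities
    # in /etc/yum.repos.d/ceph.repo, in each [section]:
    priority=5
    # in /etc/yum.repos.d/epel.repo, in each [section]:
    priority=30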

Re: [ceph-users] How to avoid deep-scrubbing performance hit?

2014-06-10 Thread Dan Van Der Ster
OTOH, since a disk/op thread switches between scrubbing and client IO responsibilities, could Ceph use ioprio_set to change the IO priorities on the fly? Cheers, Dan -- Dan van der Ster || Data & Storage Services || CERN IT Department -- On 10 Jun 2014, at 00:22, Craig Lewis mailto
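
At the OS level the experiment is easy to try by hand with ionice against a running OSD (only meaningful with the cfq scheduler; the pid is a placeholder, and whether Ceph should do this internally is exactly the question above):

    # drop a whole OSD process into the idle IO class
    ionice -c 3 -p <ceph-osd-pid>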

Re: [ceph-users] PG Selection Criteria for Deep-Scrub

2014-06-11 Thread Dan Van Der Ster
Hi Greg, This tracker issue is relevant: http://tracker.ceph.com/issues/7288 Cheers, Dan On 11 Jun 2014, at 00:30, Gregory Farnum wrote: > Hey Mike, has your manual scheduling resolved this? I think I saw > another similar-sounding report, so a feature request to improve scrub > scheduling would

Re: [ceph-users] How to avoid deep-scrubbing performance hit?

2014-06-11 Thread Dan Van Der Ster
On 10 Jun 2014, at 11:59, Dan Van Der Ster wrote: > One idea I had was to check the behaviour under different disk io schedulers, > trying exploit thread io priorities with cfq. So I have a question for the > developers about using ionice or ioprio_set to lower the IO prioriti

Re: [ceph-users] Throttle pool pg_num/pgp_num increase impact

2014-07-08 Thread Dan Van Der Ster
Hi Greg, We're also due for a similar splitting exercise in the not too distant future, and will also need to minimize the impact on latency. In addition to increasing pg_num in small steps and using a minimal max_backfills/recoveries configuration, I was planning to increase pgp_num very slowl

Re: [ceph-users] Hang of ceph-osd -i (adding an OSD)

2014-07-09 Thread Dan Van Der Ster
Hi, On 09 Jul 2014, at 14:44, Robert van Leeuwen wrote: >> I cannot add a new OSD to a current Ceph cluster. >> It just hangs, here is the debug log: >> This is ceph 0.72.1 on CentOS. > > Found the issue: > Although I installed the specific ceph (0.72.1) versio

Re: [ceph-users] Hang of ceph-osd -i (adding an OSD)

2014-07-09 Thread Dan Van Der Ster
On 09 Jul 2014, at 15:30, Robert van Leeuwen wrote: >> Which leveldb from where? 1.12.0-5 that tends to be in el6/7 repos is broken >> for Ceph. >> You need to remove the “basho fix” patch. >> 1.7.0 is the only readily available version that works, though it is so old

Re: [ceph-users] nf_conntrack overflow crashes OSDs

2014-08-08 Thread Dan Van Der Ster
", } But I don’t remember when or how we discovered this, and google isn’t helping. I suggest that this should be added to ceph.com docs. Cheers, Dan -- Dan van der Ster || Data & Storage Services || CERN IT Department -- On 08 Aug 2014, at 10:46, Christian Kauhaus wrote: > Hi, >

Re: [ceph-users] Show IOps per VM/client to find heavy users...

2014-08-08 Thread Dan Van Der Ster
. BTW, do you throttle your clients? We found that it's absolutely necessary, since without a throttle just a few active VMs can eat up the entire iops capacity of the cluster. Cheers, Dan -- Dan van der Ster || Data & Storage Services || CERN IT Department -- On 08 Aug 2014, at 13:51, And

Re: [ceph-users] Show IOps per VM/client to find heavy users...

2014-08-08 Thread Dan Van Der Ster
15:44, Dan Van Der Ster mailto:daniel.vanders...@cern.ch>> wrote: Hi, Here’s what we do to identify our top RBD users. First, enable log level 10 for the filestore so you can see all the IOs coming from the VMs. Then use a script like this (used on a dumpling cluster): https://github.

Re: [ceph-users] Show IOps per VM/client to find heavy users...

2014-08-11 Thread Dan Van Der Ster
Hi, I changed the script to be a bit more flexible with the osd path. Give this a try again: https://github.com/cernceph/ceph-scripts/blob/master/tools/rbd-io-stats.pl Cheers, Dan -- Dan van der Ster || Data & Storage Services || CERN IT Department -- On 11 Aug 2014, at 12:48, Andrija P
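
The log-level bump the script depends on can be applied at runtime roughly like this (debug 10 filestore logging is very verbose, so turn it back down afterwards):

    # record every filestore IO in the OSD logs
    ceph tell osd.* injectargs -- --debug-filestore 10
    # ... gather logs and feed them to rbd-io-stats.pl ...
    # then restore your previous level, e.g.
    ceph tell osd.* injectargs -- --debug-filestore 1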

Re: [ceph-users] Serious performance problems with small file writes

2014-08-20 Thread Dan Van Der Ster
it might be that something else is reading the disks heavily. One thing to check is updatedb — we had to disable it from indexing /var/lib/ceph on our OSDs. Best Regards, Dan -- Dan van der Ster || Data & Storage Services || CERN IT Department -- On 20 Aug 2014, at 16:39, Hugo Mills w
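
The updatedb exclusion mentioned above is a one-line change in /etc/updatedb.conf (keep whatever paths your distribution already prunes; this is just a sketch):

    # /etc/updatedb.conf
    PRUNEPATHS="/tmp /var/spool /media /var/lib/ceph"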

Re: [ceph-users] Serious performance problems with small file writes

2014-08-20 Thread Dan Van Der Ster
e committing max ops/bytes, and the filestore wbthrottle xfs * options. (I’m not going to publish exact configs here because I haven’t finished tuning yet). Cheers, Dan Thanks a lot!! Best regards, German Anders On Wednesday 20/08/2014 at 11:51, Dan Van Der Ster wrote: Hi, Do y

Re: [ceph-users] Serious performance problems with small file writes

2014-08-21 Thread Dan Van Der Ster
Hi Hugo, On 20 Aug 2014, at 17:54, Hugo Mills wrote: >> What are you using for OSD journals? > > On each machine, the three OSD journals live on the same ext4 > filesystem on an SSD, which is also the root filesystem of the > machine. > >> Also check the CPU usage for the mons and osds... >

Re: [ceph-users] MON running 'ceph -w' doesn't see OSD's booting

2014-08-21 Thread Dan Van Der Ster
Hi, You only have one OSD? I’ve seen similar strange things in test pools having only one OSD — and I kinda explained it by assuming that OSDs need peers (other OSDs sharing the same PG) to behave correctly. Install a second OSD and see how it goes... Cheers, Dan On 21 Aug 2014, at 02:59, Bruc

Re: [ceph-users] Serious performance problems with small file writes

2014-08-21 Thread Dan Van Der Ster
Hi Hugo, On 21 Aug 2014, at 14:17, Hugo Mills wrote: > > Not sure what you mean about colocated journal/OSD. The journals > aren't on the same device as the OSDs. However, all three journals on > each machine are on the same SSD. I obviously didn’t drink enough coffee this morning. I read y

Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-08-29 Thread Dan Van Der Ster
-- Dan van der Ster || Data & Storage Services || CERN IT Department -- On 28 Aug 2014, at 18:11, Sebastien Han wrote: > Hey all, > > It has been a while since the last thread performance related on the ML :p > I’ve been running some experiment to see how much I can get from an

Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-09-02 Thread Dan Van Der Ster
Hi Sebastien, That sounds promising. Did you enable the sharded ops to get this result? Cheers, Dan > On 02 Sep 2014, at 02:19, Sebastien Han wrote: > > Mark and all, Ceph IOPS performance has definitely improved with Giant. > With this version: ceph version 0.84-940-g3215c52 > (3215c520e1306

[ceph-users] SSD journal deployment experiences

2014-09-04 Thread Dan Van Der Ster
would perform adequately, that’d give us quite a few SSDs to build a dedicated high-IOPS pool. I’d also appreciate any other suggestions/experiences which might be relevant. Thanks! Dan -- Dan van der Ster || Data & Storage Services || CERN IT D

Re: [ceph-users] SSD journal deployment experiences

2014-09-04 Thread Dan Van Der Ster
use some performance tweaking with small reads before it will be really viable for us. Robert LeBlanc On Thu, Sep 4, 2014 at 10:21 AM, Dan Van Der Ster mailto:daniel.vanders...@cern.ch>> wrote: Dear Cephalopods, In a few weeks we will receive a batch of 200GB Intel DC S3700’s to augmen

Re: [ceph-users] SSD journal deployment experiences

2014-09-04 Thread Dan Van Der Ster
s of Ceph (n-1 or n-2) will be a bit too old from where we want to be, which I'm sure will work wonderfully on Red Hat, but how will n.1, n.2 or n.3 run? Robert LeBlanc On Thu, Sep 4, 2014 at 11:22 AM, Dan Van Der Ster mailto:daniel.vanders...@cern.ch>> wrote: Hi Robert, That

Re: [ceph-users] SSD journal deployment experiences

2014-09-04 Thread Dan van der Ster
get our Infiniband gear to test AccelIO with Ceph. I'm interested to see what you decide to do and what your results are. On Thu, Sep 4, 2014 at 12:12 PM, Dan Van Der Ster wrote: I've just been reading the bcache docs. It's a pity the mirrored writes aren't

Re: [ceph-users] SSD journal deployment experiences

2014-09-04 Thread Dan van der Ster
Hi Stefan, September 4 2014 9:13 PM, "Stefan Priebe" wrote: > Hi Dan, hi Robert, > > On 04.09.2014 21:09, Dan van der Ster wrote: > >> Thanks again for all of your input. I agree with your assessment -- in >> our cluster we avg <3ms for a random (hot) 4k

Re: [ceph-users] SSD journal deployment experiences

2014-09-04 Thread Dan van der Ster
oticeable in our cluster even now. 24 _was_ noticeable, so I maybe 5 is doable. Thanks for the input, Dan > Anyways, we have pretty low cluster usage but in our experience ssd seem to > handle the constant > load very well. > > Cheers, > Martin > > On Thu, Sep 4, 201

Re: [ceph-users] SSD journal deployment experiences

2014-09-04 Thread Dan van der Ster
Hi Craig, September 4 2014 11:50 PM, "Craig Lewis" wrote: > On Thu, Sep 4, 2014 at 9:21 AM, Dan Van Der Ster > wrote: >> 1) How often are DC S3700's failing in your deployments? > > None of mine have failed yet. I am planning to monitor the wear

Re: [ceph-users] SSD journal deployment experiences

2014-09-05 Thread Dan Van Der Ster
Hi Christian, > On 05 Sep 2014, at 03:09, Christian Balzer wrote: > > > Hello, > > On Thu, 4 Sep 2014 14:49:39 -0700 Craig Lewis wrote: > >> On Thu, Sep 4, 2014 at 9:21 AM, Dan Van Der Ster >> wrote: >> >>> >>> >>> 1) How

Re: [ceph-users] SSD journal deployment experiences

2014-09-05 Thread Dan Van Der Ster
> On 05 Sep 2014, at 10:30, Nigel Williams wrote: > > On Fri, Sep 5, 2014 at 5:46 PM, Dan Van Der Ster > wrote: >>> On 05 Sep 2014, at 03:09, Christian Balzer wrote: >>> You might want to look into cache pools (and dedicated SSD servers with >>> fast con

Re: [ceph-users] SSD journal deployment experiences

2014-09-05 Thread Dan Van Der Ster
> On 05 Sep 2014, at 11:04, Christian Balzer wrote: > > > Hello Dan, > > On Fri, 5 Sep 2014 07:46:12 + Dan Van Der Ster wrote: > >> Hi Christian, >> >>> On 05 Sep 2014, at 03:09, Christian Balzer wrote: >>> >>> >>>

Re: [ceph-users] SSD journal deployment experiences

2014-09-06 Thread Dan van der Ster
Hi Christian, Let's keep debating until a dev corrects us ;) September 6 2014 1:27 PM, "Christian Balzer" wrote: > On Fri, 5 Sep 2014 09:42:02 + Dan Van Der Ster wrote: > >>> On 05 Sep 2014, at 11:04, Christian Balzer wrote: >>> >>> On

Re: [ceph-users] SSD journal deployment experiences

2014-09-06 Thread Dan van der Ster
September 6 2014 4:01 PM, "Christian Balzer" wrote: > On Sat, 6 Sep 2014 13:07:27 + Dan van der Ster wrote: > >> Hi Christian, >> >> Let's keep debating until a dev corrects us ;) > > For the time being, I give the recent: > > htt

Re: [ceph-users] SSD journal deployment experiences

2014-09-06 Thread Dan Van Der Ster
nsidered RAID 5 over your SSDs? Practically speaking, there's no performance downside to RAID 5 when your devices aren't IOPS-bound. On Sat Sep 06 2014 at 8:37:56 AM Christian Balzer mailto:ch...@gol.com>> wrote: On Sat, 6 Sep 2014 14:50:20 + Dan van der Ster wrote: > Se

Re: [ceph-users] SSD journal deployment experiences

2014-09-08 Thread Dan Van Der Ster
Hi Scott, > On 06 Sep 2014, at 20:39, Scott Laird wrote: > > IOPS are weird things with SSDs. In theory, you'd see 25% of the write IOPS > when writing to a 4-way RAID5 device, since you write to all 4 devices in > parallel. Except that's not actually true--unlike HDs where an IOP is an > I

Re: [ceph-users] NAS on RBD

2014-09-09 Thread Dan Van Der Ster
Hi Blair, > On 09 Sep 2014, at 09:05, Blair Bethwaite wrote: > > Hi folks, > > In lieu of a prod ready Cephfs I'm wondering what others in the user > community are doing for file-serving out of Ceph clusters (if at all)? > > We're just about to build a pretty large cluster - 2PB for file-based

Re: [ceph-users] NAS on RBD

2014-09-09 Thread Dan Van Der Ster
> On 09 Sep 2014, at 16:39, Michal Kozanecki wrote: > On 9 September 2014 08:47, Blair Bethwaite wrote: >> On 9 September 2014 20:12, Dan Van Der Ster >> wrote: >>> One thing I’m not comfortable with is the idea of ZFS checking the data in >>> additio

Re: [ceph-users] Ceph general configuration questions

2014-09-16 Thread Dan Van Der Ster
Hi, On 16 Sep 2014, at 16:46, shiva rkreddy mailto:shiva.rkre...@gmail.com>> wrote: 2. Has any one used SSD devices for Monitors. If so, can you please share the details ? Any specific changes to the configuration files? We use SSDs on our monitors — a spinning disk was not fast enough for lev

Re: [ceph-users] Still seing scrub errors in .80.5

2014-09-16 Thread Dan Van Der Ster
Hi Greg, I believe Marc is referring to the corruption triggered by set_extsize on xfs. That option was disabled by default in 0.80.4... See the thread "firefly scrub error". Cheers, Dan From: Gregory Farnum Sent: Sep 16, 2014 8:15 PM To: Marc Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-u

Re: [ceph-users] Ceph general configuration questions

2014-09-16 Thread Dan Van Der Ster
lse or doesn't matter? I think it doesn’t matter. We use xfs. Cheers, Dan On Tue, Sep 16, 2014 at 10:15 AM, Dan Van Der Ster mailto:daniel.vanders...@cern.ch>> wrote: Hi, On 16 Sep 2014, at 16:46, shiva rkreddy mailto:shiva.rkre...@gmail.com>> wrote: 2. Has any one used S

Re: [ceph-users] RGW hung, 2 OSDs using 100% CPU

2014-09-17 Thread Dan Van Der Ster
Hi Florian, > On 17 Sep 2014, at 17:09, Florian Haas wrote: > > Hi Craig, > > just dug this up in the list archives. > > On Fri, Mar 28, 2014 at 2:04 AM, Craig Lewis > wrote: >> In the interest of removing variables, I removed all snapshots on all pools, >> then restarted all ceph daemons at

Re: [ceph-users] RGW hung, 2 OSDs using 100% CPU

2014-09-17 Thread Dan Van Der Ster
er call to e.g 16, or to fix the loss of purged_snaps after backfilling. Actually, probably both of those are needed. But a real dev would know better. Cheers, Dan From: Florian Haas Sent: Sep 17, 2014 5:33 PM To: Dan Van Der Ster Cc: Craig Lewis ;ceph-users@lists.ceph.com Subject: Re: [ceph-

Re: [ceph-users] Monitor Restart triggers half of our OSDs marked down

2015-02-05 Thread Dan van der Ster
Hi, We also have seen this once after upgrading to 0.80.8 (from dumpling). Last week we had a network outage which marked out around 1/3rd of our OSDs. The outage lasted less than a minute -- all the OSDs were brought up once the network was restored. Then 30 minutes later I restarted one monitor

Re: [ceph-users] Monitor Restart triggers half of our OSDs marked down

2015-02-05 Thread Dan van der Ster
On Thu, Feb 5, 2015 at 9:54 AM, Sage Weil wrote: > On Thu, 5 Feb 2015, Dan van der Ster wrote: >> Hi, >> We also have seen this once after upgrading to 0.80.8 (from dumpling). >> Last week we had a network outage which marked out around 1/3rd of our >> OSDs. The outag

Re: [ceph-users] new ssd intel s3610, has somebody tested them ?

2015-02-20 Thread Dan van der Ster
Interesting, thanks for the link. I hope the quality on the 3610/3710 is as good as the 3700... we haven't yet seen a single failure in production. Cheers, Dan On Fri, Feb 20, 2015 at 8:06 AM, Alexandre DERUMIER wrote: > Hi, > > Intel has just released new ssd s3610: > > http://www.anandtech.c

[ceph-users] running giant/hammer mds with firefly osds

2015-02-20 Thread Dan van der Ster
Hi all, Back in the dumpling days, we were able to run the emperor MDS with dumpling OSDs -- this was an improvement over the dumpling MDS. Now we have stable firefly OSDs, but I was wondering if we can reap some of the recent CephFS developments by running a giant or ~hammer MDS with our firefly

Re: [ceph-users] running giant/hammer mds with firefly osds

2015-02-20 Thread Dan van der Ster
On Fri, Feb 20, 2015 at 7:56 PM, Gregory Farnum wrote: > On Fri, Feb 20, 2015 at 3:50 AM, Luis Periquito wrote: >> Hi Dan, >> >> I remember http://tracker.ceph.com/issues/9945 introducing some issues with >> running cephfs between different versions of giant/firefly. >> >> https://www.mail-archiv

Re: [ceph-users] who is using radosgw with civetweb?

2015-02-26 Thread Dan van der Ster
Hi Sage, We switched from apache+fastcgi to civetweb (+haproxy) around one month ago and so far it is working quite well. Just like GuangYang, we had seen many error 500's with fastcgi, but we never investigated it deeply. After moving to civetweb we don't get any errors at all no matter what load
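
For anyone reproducing this setup, the civetweb side is a single line in the radosgw section of ceph.conf, with haproxy in front for TLS and load balancing (port and section name are examples):

    [client.rgw.gateway1]
    rgw frontends = civetweb port=7480
    # haproxy then forwards :80/:443 to 127.0.0.1:7480 on each gateway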

Re: [ceph-users] v0.80.9 Firefly released

2015-03-11 Thread Dan van der Ster
Hi Sage, On Tue, Mar 10, 2015 at 8:34 PM, Sage Weil wrote: > Adjusting CRUSH maps > > > * This point release fixes several issues with CRUSH that trigger > excessive data migration when adjusting OSD weights. These are most > obvious when a very small weight change (e.g.

Re: [ceph-users] Creating and deploying OSDs in parallel

2015-03-31 Thread Dan van der Ster
Hi Somnath, We have deployed many machines in parallel and it generally works. Keep in mind that if you deploy many many (>1000) then this will create so many osdmap incrementals, so quickly, that the memory usage on the OSDs will increase substantially (until you reboot). Best Regards, Dan On Mon

Re: [ceph-users] 100% IO Wait with CEPH RBD and RSYNC

2015-04-20 Thread Dan van der Ster
Hi, This is similar to what you would observe if you hit the ulimit on open files/sockets in a Ceph client. Though that normally only affects clients in user mode, not the kernel. What are the ulimits of your rbd-fuse client? Also, you could increase the client logging debug levels to see why the c
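
Quick ways to check those limits on the running client (the pid is a placeholder):

    # open-file limit of the rbd-fuse process
    grep 'open files' /proc/<pid>/limits
    # how many descriptors it currently holds
    ls /proc/<pid>/fd | wc -l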
