[ceph-users] SSD OSDs - more Cores or more GHz

2016-01-20 Thread Götz Reinicke - IT Koordinator
Hi folks, we plan to use more SSD OSDs in our first cluster layout instead of SAS OSDs (more IO is needed than space). Short question: what would influence the performance more, more cores or more GHz per core? Or is it as always: depends on the total of OSDs/nodes/repl-level/etc ... :) If needed, I

[ceph-users] how to use the setomapval to change rbd size info?

2016-01-20 Thread 张鹏
I want to change the omap value of an rbd size, so I do something like: 1. create an rbd named zp3 with size 10G [root@lab8106 rbdre]# rbd create zp3 --size 10G 2. see rbd information [root@lab8106 rbdre]# rbd info zp3 rbd image 'zp3': size 10240 MB in 2560 objects order 22 (4096 kB objects) block_name_pr
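For reference, a minimal sketch of the supported way to change an image's size (the 20G/5G targets are illustrative); editing the header's omap values by hand bypasses librbd's own bookkeeping, so rbd resize is generally the safer route:

    rbd resize zp3 --size 20G                    # grow the image through librbd
    rbd info zp3                                 # verify the new size
    # shrinking additionally requires an explicit flag:
    rbd resize zp3 --size 5G --allow-shrink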

Re: [ceph-users] SSD OSDs - more Cores or more GHz

2016-01-20 Thread Christian Balzer
Hello, On Wed, 20 Jan 2016 10:01:19 +0100 Götz Reinicke - IT Koordinator wrote: > Hi folks, > > we plan to use more ssd OSDs in our first cluster layout instead of SAS > osds. (more IO is needed than space) > > short question: What would influence the performance more? more Cores or > more GHz

Re: [ceph-users] bucket type and crush map

2016-01-20 Thread Ivan Grcic
Hi Pedro, you have to take your pool size into account, which is probably 3. That way you get 840 * 3 / 6 = 420 ( PGs * PoolSize / OSD Num ) Please read: http://docs.ceph.com/docs/master/rados/operations/placement-groups/#choosing-the-number-of-placement-groups Regards, Ivan On Mon, Jan 18,
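A quick sanity check of the arithmetic in that formula, using the values from the message above:

    # PGs per OSD = total PGs * pool size / number of OSDs
    echo $(( 840 * 3 / 6 ))    # prints 420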

Re: [ceph-users] how to use the setomapval to change rbd size info?

2016-01-20 Thread Ilya Dryomov
On Wed, Jan 20, 2016 at 10:48 AM, 张鹏 wrote: > i want change the omapval of a rbd size so i do some thing like : > > 1、create a rbd name zp3 with size 10G > [root@lab8106 rbdre]# rbd create zp3 --size 10G > > 2、see rbd information > [root@lab8106 rbdre]# rbd info zp3 > rbd image 'zp3': > size 1024

Re: [ceph-users] SSD OSDs - more Cores or more GHz

2016-01-20 Thread Tomasz Kuzemko
Hi, my team did some benchmarks in the past to answer this question. I don't have results at hand, but the conclusion was that it depends on how many disks/OSDs you have in a single host: above 9 there was more benefit from more cores than GHz (6-core 3.5GHz vs 10-core 2.4GHz AFAIR). -- Tomasz Kuzemko

Re: [ceph-users] SSD OSDs - more Cores or more GHz

2016-01-20 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Christian Balzer > Sent: 20 January 2016 10:31 > To: ceph-us...@ceph.com > Subject: Re: [ceph-users] SSD OSDs - more Cores or more GHz > > > Hello, > > On Wed, 20 Jan 2016 10:01:19 +0100 G

Re: [ceph-users] CentOS 7 iscsi gateway using lrbd

2016-01-20 Thread Nick Fisk
Thanks for your input Mike, a couple of questions if I may 1. Are you saying that this rbd backing store is not in mainline and is only in SUSE kernels? Ie can I use this lrbd on Debian/Ubuntu/CentOS? 2. Does this have any positive effect on the abort/reset death loop a number of us were seeing

Re: [ceph-users] SSD OSDs - more Cores or more GHz

2016-01-20 Thread Oliver Dzombic
Hi, Cores > Frequency. If you think about recovery/scrubbing tasks, it's better when a CPU core can be assigned to do this, compared to a situation where the same CPU core needs to recover/scrub and still deliver the productive content at the same time. The more you can create a situation where

Re: [ceph-users] SSD OSDs - more Cores or more GHz

2016-01-20 Thread Jan Schermer
The OSD is able to use more than one core to do the work, so increasing the number of cores will increase throughput. However, if you care about latency then that is always tied to speed=frequency. If the question was "should I get 40GHz in 8 cores or in 16 cores" then the answer will always be

Re: [ceph-users] SSD OSDs - more Cores or more GHz

2016-01-20 Thread Jan Schermer
This is very true, but do you actually exclusively pin the cores to the OSD daemons so they don't interfere? I don't think many people do that; it wouldn't work with more than a handful of OSDs. The OSD might typically only need <100% of one core, but during startup or some reshuffling it's benefi
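For anyone who does want to experiment with pinning, a rough sketch using taskset; the core list is illustrative and would need to be sized to the host:

    # bind every ceph-osd process on this host to cores 0-7 (example core list)
    for pid in $(pidof ceph-osd); do
        taskset -cp 0-7 "$pid"
    done

This only changes CPU affinity for the running processes; it is not persistent across daemon restarts.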

[ceph-users] S3 upload to RadosGW slows after few chunks

2016-01-20 Thread Rishiraj Rana
Hey guys, I am having an S3 upload to Ceph issue wherein the upload seems to crawl after the first few chunks of a multipart upload. The test file is 38M in size and the upload was tried with the s3 default chunk size at 15M and then tried again with chunk size set to 5M and then was tested again

Re: [ceph-users] SSD OSDs - more Cores or more GHz

2016-01-20 Thread Oliver Dzombic
Hi Jan, actually the Linux kernel does this automatically anyway (sending new processes to "empty/low used" cores). A single scrubbing/recovery or whatever process won't take more than 100% CPU (one core), because technically these processes are not able to run multi-threaded. Of course, if you

Re: [ceph-users] SSD OSDs - more Cores or more GHz

2016-01-20 Thread Jan Schermer
I'm using Ceph with all SSDs; I doubt you have to worry about speed that much with HDD (it will be abysmal either way). With SSDs you need to start worrying about processor caches and memory colocation in NUMA systems; the Linux scheduler is not really that smart right now. Yes, the process will get i
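A hedged example of testing whether NUMA locality matters on a given box; the node number and OSD id are illustrative, and in practice this would normally be wired into the init script or unit file rather than run by hand:

    numactl --hardware                       # show the NUMA nodes and which CPUs/memory belong to each
    numactl --cpunodebind=0 --membind=0 \
        ceph-osd -i 0 -f                     # run osd.0 with CPU and memory confined to node 0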

Re: [ceph-users] SSD OSDs - more Cores or more GHz

2016-01-20 Thread Wade Holler
Great commentary. While it is fundamentally true that higher clock speed equals lower latency, in my practical experience we are more often interested in latency at the concurrency profile of the applications. So in this regard I favor more cores when I have to choose, such that we can support m

Re: [ceph-users] SSD OSDs - more Cores or more GHz

2016-01-20 Thread Oliver Dzombic
Hi, to be honest, I never made real benchmarks about that. But to me, I doubt that the higher frequency of a CPU will have a "real" impact on Ceph's performance. I mean, yes, mathematically, just like Wade pointed out, it's true: higher frequency = lower latency. But when we compare CPUs of the same mode

Re: [ceph-users] SSD OSDs - more Cores or more GHz

2016-01-20 Thread Mark Nelson
"It depends" is the right answer IMHO. There are advantages to building smaller single-socket high-frequency nodes. The CPUs are cheap, which helps offset the lower-density node cost, and as has been mentioned in this thread you don't have to deal with NUMA pinning and other annoying complicati

Re: [ceph-users] SSD OSDs - more Cores or more GHz

2016-01-20 Thread Jan Schermer
Let's separate those issues. 1) NUMA, memory colocation, CPU-ethernet latencies: this is all pretty minor _until_ you hit some limit, and then it is something quite different. 2) Performance of a single IO hitting an OSD: in my case (with an old Ceph release, so possibly a less optimal codepath) my IO l

Re: [ceph-users] SSD OSDs - more Cores or more GHz

2016-01-20 Thread Götz Reinicke - IT Koordinator
Am 20.01.16 um 11:30 schrieb Christian Balzer: > > Hello, > > On Wed, 20 Jan 2016 10:01:19 +0100 Götz Reinicke - IT Koordinator wrote: > >> Hi folks, >> >> we plan to use more ssd OSDs in our first cluster layout instead of SAS >> osds. (more IO is needed than space) >> >> short question: What w

[ceph-users] How to set a new Crushmap in production

2016-01-20 Thread Vincent Godin
Hi, I need to import a new crushmap in production (the old one is the default one) to define two datacenters and to isolate SSD from SATA disks. What is the best way to do this without starting a hurricane on the platform? Till now, I was just using hosts (SATA OSD) on one datacenter with the de

Re: [ceph-users] SSD OSDs - more Cores or more GHz

2016-01-20 Thread Nick Fisk
See this benchmark I did last year http://www.sys-pro.co.uk/ceph-storage-fast-cpus-ssd-performance/ > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Oliver Dzombic > Sent: 20 January 2016 13:33 > To: ceph-us...@ceph.com > Subject: Re: [cep

Re: [ceph-users] SSD OSDs - more Cores or more GHz

2016-01-20 Thread Mark Nelson
Excellent testing Nick! Mark On 01/20/2016 08:18 AM, Nick Fisk wrote: See this benchmark I did last year http://www.sys-pro.co.uk/ceph-storage-fast-cpus-ssd-performance/ -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Oliver Dzombic Sent:

Re: [ceph-users] CRUSH Rule Review - Not replicating correctly

2016-01-20 Thread deeepdish
Hi Robert, Just wanted to let you know that after applying your crush suggestion and allowing the cluster to rebalance itself, I now have symmetrical data distribution. In keeping 5 monitors, my rationale is availability. I have 3 compute nodes + 2 storage nodes. I was thinking that making all

[ceph-users] ceph fuse closing stale session while still operable

2016-01-20 Thread Oliver Dzombic
Hi, I am testing on a CentOS 6 x64 minimal install. I am mounting successfully: ceph-fuse -m 10.0.0.1:6789,10.0.0.2:6789,10.0.0.3:6789,10.0.0.4:6789 /ceph-storage/ [root@cn201 log]# df Filesystem 1K-blocks Used Available Use% Mounted on /dev/sda1 74454192 1228644

[ceph-users] jemalloc-enabled packages on trusty?

2016-01-20 Thread Zoltan Arnold Nagy
Hi, Has someone published prebuilt debs for trusty from hammer with jemalloc compiled-in instead of tcmalloc, or does everybody need to compile it themselves? :-) Cheers, Zoltan

[ceph-users] Ceph monitors 100% full filesystem, refusing start

2016-01-20 Thread Wido den Hollander
Hello, I have an issue with a (not in production!) Ceph cluster which I'm trying to resolve. On Friday the network links between the racks failed and this caused all monitors to lose connection. Their leveldb stores kept growing and they are currently 100% full. They all have a few hundred MB l

Re: [ceph-users] Ceph monitors 100% full filesystem, refusing start

2016-01-20 Thread Zoltan Arnold Nagy
Hi Wido, So one out of the 5 monitors is running fine then? Did that have more space for its leveldb? > On 20 Jan 2016, at 16:15, Wido den Hollander wrote: > > Hello, > > I have an issue with a (not in production!) Ceph cluster which I'm > trying to resolve. > > On Friday the network links

Re: [ceph-users] Ceph monitors 100% full filesystem, refusing start

2016-01-20 Thread Nick Fisk
Is there anything you can do with a USB key/NFS mount? I.e. copy the leveldb onto it, remount in the proper location, compact and then copy back to the primary disk? > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Wido den Hollander > Sent: 20 January 2
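For the related compaction knobs, a hedged sketch (the monitor name is illustrative, and both options only help once the store has somewhere to grow during compaction):

    # ask a running monitor to compact its store
    ceph tell mon.ceph1 compact
    # or, in ceph.conf under [mon], compact the store every time the daemon starts:
    #   mon compact on start = true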

Re: [ceph-users] Ceph monitors 100% full filesystem, refusing start

2016-01-20 Thread Joao Eduardo Luis
On 01/20/2016 03:15 PM, Wido den Hollander wrote: > Hello, > > I have an issue with a (not in production!) Ceph cluster which I'm > trying to resolve. > > On Friday the network links between the racks failed and this caused all > monitors to loose connection. > > Their leveldb stores kept growin

Re: [ceph-users] Ceph monitors 100% full filesystem, refusing start

2016-01-20 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Wido den Hollander > Sent: 20 January 2016 15:27 > To: Zoltan Arnold Nagy > Cc: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] Ceph monitors 100% full filesystem, refusing start > > On

Re: [ceph-users] Ceph monitors 100% full filesystem, refusing start

2016-01-20 Thread Wido den Hollander
On 01/20/2016 04:22 PM, Zoltan Arnold Nagy wrote: > Hi Wido, > > So one out of the 5 monitors are running fine then? Did that have more space > for it’s leveldb? > Yes. That was at 99% full and by cleaning some stuff in /var/cache and /var/log I was able to start it. It compacted the levelDB d

Re: [ceph-users] SSD OSDs - more Cores or more GHz

2016-01-20 Thread Somnath Roy
Yes, thanks for the data. BTW, Nick, do we know what is more important, more CPU cores or more frequency? For example, we have Xeon CPUs available with a bit less frequency but with more cores/socket, so which one should we be going with for OSD servers? Thanks & Regards Somnath -Origina

Re: [ceph-users] Ceph monitors 100% full filesystem, refusing start

2016-01-20 Thread Wido den Hollander
On 01/20/2016 04:25 PM, Joao Eduardo Luis wrote: > On 01/20/2016 03:15 PM, Wido den Hollander wrote: >> Hello, >> >> I have an issue with a (not in production!) Ceph cluster which I'm >> trying to resolve. >> >> On Friday the network links between the racks failed and this caused all >> monitors to

Re: [ceph-users] How to set a new Crushmap in production

2016-01-20 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 I'm not aware of a way of slowing things down other than modifying osd_max_backfills, osd_backfill_scan_{min,max}, and osd_recovery_max_active as mentioned in [1]. The nature of injecting a new CRUSH map is usually the result of several changes and
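A rough sketch of the usual sequence, with recovery throttled before the new map goes in (file names and values are illustrative):

    # throttle backfill/recovery before changing the map
    ceph tell 'osd.*' injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'
    # compile the edited text map and inject it
    crushtool -c new-crushmap.txt -o new-crushmap.bin
    ceph osd setcrushmap -i new-crushmap.bin
    # watch the resulting data movement
    ceph -w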

Re: [ceph-users] SSD OSDs - more Cores or more GHz

2016-01-20 Thread Nick Fisk
Hi Somnath, Unfortunately I don't have any figures, or really any way to generate them, to answer that question. However my gut feeling is that it is heavily dependent on the type of workload. Faster GHz will lower per-request latency, but only up to the point where the CPUs start getting busy.

Re: [ceph-users] SSD OSDs - more Cores or more GHz

2016-01-20 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 I did some tests, this is a single OSD server with I think 2 Intel 3500 drives and replication 1 and the results are in IOPs. I think the client was on a different host, but it has been a long time since I did this test. I adjusted the number of core

Re: [ceph-users] RGW -- 404 on keys in bucket.list() thousands of multipart ids listed as well.

2016-01-20 Thread seapasu...@uchicago.edu
On 1/19/16 4:00 PM, Yehuda Sadeh-Weinraub wrote: On Fri, Jan 15, 2016 at 5:04 PM, seapasu...@uchicago.edu wrote: I have looked all over and I do not see any explicit mention of "NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959" in the logs nor do I see a timestamp from November 4th althou

Re: [ceph-users] Ceph monitors 100% full filesystem, refusing start

2016-01-20 Thread Zoltan Arnold Nagy
Wouldn’t actually blowing away the other monitors then recreating them from scratch solve the issue? Never done this, just thinking out loud. It would grab the osdmap and everything from the other monitor and form a quorum, wouldn’t it? > On 20 Jan 2016, at 16:26, Wido den Hollander wrote: >

Re: [ceph-users] RGW -- 404 on keys in bucket.list() thousands of multipart ids listed as well.

2016-01-20 Thread Yehuda Sadeh-Weinraub
On Wed, Jan 20, 2016 at 10:43 AM, seapasu...@uchicago.edu wrote: > > > On 1/19/16 4:00 PM, Yehuda Sadeh-Weinraub wrote: >> >> On Fri, Jan 15, 2016 at 5:04 PM, seapasu...@uchicago.edu >> wrote: >>> >>> I have looked all over and I do not see any explicit mention of >>> "NWS_NEXRAD_NXL2DP_PAKC_2015

Re: [ceph-users] Ceph monitors 100% full filesystem, refusing start

2016-01-20 Thread Wido den Hollander
On 01/20/2016 08:01 PM, Zoltan Arnold Nagy wrote: > Wouldn’t actually blowing away the other monitors then recreating them > from scratch solve the issue? > > Never done this, just thinking out loud. It would grab the osdmap and > everything from the other monitor and form a quorum, wouldn’t it? >

Re: [ceph-users] RGW -- 404 on keys in bucket.list() thousands of multipart ids listed as well.

2016-01-20 Thread seapasu...@uchicago.edu
So is there any way to prevent this from happening going forward? I mean ideally this should never be possible, right? Even with a complete object that is 0 bytes it should be downloaded as 0 bytes and have a different md5sum and not report as 7mb? On 1/20/16 1:30 PM, Yehuda Sadeh-Weinraub wr

Re: [ceph-users] RGW -- 404 on keys in bucket.list() thousands of multipart ids listed as well.

2016-01-20 Thread Yehuda Sadeh-Weinraub
We'll need to confirm that this is the actual issue, and then have it fixed. It would be nice to have some kind of a unit test that reproduces it. Yehuda On Wed, Jan 20, 2016 at 1:34 PM, seapasu...@uchicago.edu wrote: > So is there any way to prevent this from happening going forward? I mean > ide

Re: [ceph-users] RGW -- 404 on keys in bucket.list() thousands of multipart ids listed as well.

2016-01-20 Thread seapasu...@uchicago.edu
I'm working on getting the code they used and trying different timeouts in my multipart upload code. Right now I have not created any new 404 keys though :-( On 1/20/16 3:44 PM, Yehuda Sadeh-Weinraub wrote: We'll need to confirm that this is the actual issue, and then have it fixed. It would b

Re: [ceph-users] RGW -- 404 on keys in bucket.list() thousands of multipart ids listed as well.

2016-01-20 Thread Yehuda Sadeh-Weinraub
Keep in mind that if the problem is that the tail is being sent to garbage collection, you'll only see the 404 after a few hours. A shorter way to check it would be by listing the gc entries (with --include-all). Yehuda On Wed, Jan 20, 2016 at 1:52 PM, seapasu...@uchicago.edu wrote: > I'm workin
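A hedged example of the check described above:

    # list all garbage-collection entries, including ones not yet due for processing
    radosgw-admin gc list --include-all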

Re: [ceph-users] Ceph monitors 100% full filesystem, refusing start

2016-01-20 Thread Zoltan Arnold Nagy
Wouldn’t this be the same operation as growing the number of monitors from let’s say 3 to 5 in an already running, production cluster, which AFAIK is supported? Just in this case it’s not 3->5 but 1->X :) > On 20 Jan 2016, at 22:04, Wido den Hollander wrote: > > On 01/20/2016 08:01 PM, Zoltan

Re: [ceph-users] ceph fuse closing stale session while still operable

2016-01-20 Thread Gregory Farnum
On Wed, Jan 20, 2016 at 6:58 AM, Oliver Dzombic wrote: > Hi, > > i am testing on centos 6 x64 minimal install. > > i am mounting successfully: > > ceph-fuse -m 10.0.0.1:6789,10.0.0.2:6789,10.0.0.3:6789,10.0.0.4:6789 > /ceph-storage/ > > > [root@cn201 log]# df > Filesystem1K-blocksU

Re: [ceph-users] ceph fuse closing stale session while still operable

2016-01-20 Thread Oliver Dzombic
Hi Greg, thank you for your time! # ceph -s cluster health HEALTH_WARN 62 requests are blocked > 32 sec noscrub,nodeep-scrub flag(s) set monmap e9: 4 mons at {ceph1=10.0.0.1:6789/0,ceph2=10.0.0.2:6789/0,ceph3=10.0.0.3:6789/0,ceph4=10.0.0.4:6789/0}
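For digging into blocked requests like the ones above, a couple of commands that are often useful (the OSD id is illustrative, and the daemon command has to run on the host carrying that OSD):

    ceph health detail                      # shows which OSDs the blocked requests belong to
    ceph daemon osd.3 dump_historic_ops     # recent slow ops with per-step timings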

Re: [ceph-users] ceph fuse closing stale session while still operable

2016-01-20 Thread Gregory Farnum
On Wed, Jan 20, 2016 at 4:03 PM, Oliver Dzombic wrote: > Hi Greg, > > thank you for your time! > > #ceph-s > >cluster > health HEALTH_WARN > 62 requests are blocked > 32 sec > noscrub,nodeep-scrub flag(s) set > monmap e9: 4 mons at > {ceph1=10.0.0.1:6789/0,ce

Re: [ceph-users] Infernalis, cephfs: difference between df and du

2016-01-20 Thread Francois Lafont
Hi, On 19/01/2016 07:24, Adam Tygart wrote: > It appears that with --apparent-size, du adds the "size" of the > directories to the total as well. On most filesystems this is the > block size, or the amount of metadata space the directory is using. On > CephFS, this size is fabricated to be the siz
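A small sketch of how to see the recursive accounting CephFS exposes for a directory (the mount point and path are illustrative):

    du -sh --apparent-size /mnt/cephfs/somedir
    getfattr -n ceph.dir.rbytes /mnt/cephfs/somedir   # recursive byte count tracked by the MDS
    getfattr -n ceph.dir.rfiles /mnt/cephfs/somedir   # recursive file count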

Re: [ceph-users] Infernalis, cephfs: difference between df and du

2016-01-20 Thread Francois Lafont
On 21/01/2016 03:40, Francois Lafont wrote: > Ah ok, interesting. I have tested and noticed, however, that the size > of a directory is not updated immediately. For instance, if I change > the size of the regular file in a directory (of cephfs), the size of the > directory doesn't change immediately af

Re: [ceph-users] Infernalis, cephfs: difference between df and du

2016-01-20 Thread Gregory Farnum
On Wed, Jan 20, 2016 at 6:40 PM, Francois Lafont wrote: > Hi, > > On 19/01/2016 07:24, Adam Tygart wrote: >> It appears that with --apparent-size, du adds the "size" of the >> directories to the total as well. On most filesystems this is the >> block size, or the amount of metadata space the direc

Re: [ceph-users] CentOS 7 iscsi gateway using lrbd

2016-01-20 Thread Mike Christie
On 01/20/2016 06:07 AM, Nick Fisk wrote: > Thanks for your input Mike, a couple of questions if I may > > 1. Are you saying that this rbd backing store is not in mainline and is only > in SUSE kernels? Ie can I use this lrbd on Debian/Ubuntu/CentOS? The target_core_rbd backing store is not upstr

[ceph-users] Ceph scale testing

2016-01-20 Thread Somnath Roy
Hi, Here is the copy of the ppt I presented in today's performance meeting.. https://docs.google.com/presentation/d/1j4Lcb9fx0OY7eQlQ_iUI6TPVJ6t_orZWKJyhz0S_3ic/edit?usp=sharing Thanks & Regards Somnath

Re: [ceph-users] Ceph scale testing

2016-01-20 Thread Alexandre DERUMIER
Thanks Somnath ! - Mail original - De: "Somnath Roy" À: "ceph-devel" , "ceph-users" Envoyé: Jeudi 21 Janvier 2016 05:03:59 Objet: Ceph scale testing Hi, Here is the copy of the ppt I presented in today's performance meeting.. https://docs.google.com/presentation/d/1j4Lcb9fx0OY7eQlQ_