Re: [ceph-users] Best layout for SSD & SAS OSDs

2015-09-07 Thread Christian Balzer
gle Intel SSD, DC or otherwise. Christian > Jan > > > > On 07 Sep 2015, at 05:53, Christian Balzer wrote: > > > > On Sat, 5 Sep 2015 07:13:29 -0300 German Anders wrote: > > > >> Hi Christian, > >> > >>Ok so would said that it'

Re: [ceph-users] XFS and nobarriers on Intel SSD

2015-09-07 Thread Christian Balzer
recommended that barriers are turned off as the drive has a > > safe cache (I am confident that the cache will write out to disk on > > power failure)? > > > > Has anyone else encountered this issue? > > > > Any info or sug

Re: [ceph-users] XFS and nobarriers on Intel SSD

2015-09-07 Thread Christian Balzer
ort: SUCCESS scmd(880fdc85b680) --- Note that on the un-patched node (DRBD replication target) I managed to trigger this bug 3 times in the same period. So unless Intel has something to say (and given that this happens with Samsungs as well), I'd still look beady eyed at LSI/Avago... Chri

Re: [ceph-users] Hammer reduce recovery impact

2015-09-10 Thread Christian Balzer

Re: [ceph-users] bad perf for librbd vs krbd using FIO

2015-09-10 Thread Christian Balzer
16=3.2%, 32=0.0%, 64=0.0%, > >=64=0.0% > issued: total=r=0/w=102400/d=0, short=r=0/w=0/d=0, > drop=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, > depth=16 > > Run status group 0 (all jobs): > WRITE: io=102400MB, aggrb=121882KB/s, minb=121882KB/s, m

Re: [ceph-users] Using OS disk (SSD) as journal for OSD

2015-09-12 Thread Christian Balzer
Start End Size File system Name Flags > 1 1049kB 211MB 210MB ext4 boot > 2 211MB 21.2GB 21.0GB ext4 > 3 21.2GB 29.6GB 8389MB linux-swap(v1) > > There is enough space for many 5G journal partitions on sda -- Chri

Re: [ceph-users] Thumb rule for selecting memory for Ceph OSD node

2015-09-13 Thread Christian Balzer
sy? And unless you deploy like 10 of them initially, a node of that size going down will severely impact your cluster performance. > > So which rule should we consider that can stand true for a 12 OSD node > and even for a 72 OSD node? 2GB per OSD plus OS/other needs, round up to what
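A quick, hedged illustration of the "2GB per OSD plus OS/other needs" rule above for the two node sizes mentioned; the OS/overhead allowance is an assumed example value, not from the thread (shell sketch):

    # Rough RAM sizing: 2GB per OSD daemon plus headroom for the OS, page cache, etc.
    # The 8GB OS/overhead figure below is an assumption for illustration only.
    for osds in 12 72; do
        need=$(( osds * 2 + 8 ))   # GB
        echo "${osds} OSDs -> at least ${need}GB RAM, rounded up to the next common DIMM configuration"
    done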

Re: [ceph-users] XFS and nobarriers on Intel SSD

2015-09-14 Thread Christian Balzer
plan to update the > firmware on the remainder of the S3710 drives this week and also set > nobarriers. > > Regards, > > Richard > > > > On 8 September 2015 at 14:27, Richard Bade <mailto:hitr...@gmail.com> > wrote: > > Hi Christian, > > > &

Re: [ceph-users] SOLVED: CRUSH odd bucket affinity / persistence

2015-09-14 Thread Christian Balzer
osd.11 0.140 root=ssd” > >> > >> I’m able to verify that the OSD / MON host and another MON I have > >> running see the same CRUSH map. > >> > >> After rebooting OSD / MON host, both osd.10 and osd.11 become part of > >> the default bucket. How can I

Re: [ceph-users] question on reusing OSD

2015-09-16 Thread Christian Balzer
d the > > partition again automatically without reconfiguration) > > - start the OSD > > > > If you script this you should not have to use noout: the OSD should > > come back in a matter of seconds and the impact on the storage network > > minimal. > > > > Not

Re: [ceph-users] Lot of blocked operations

2015-09-17 Thread Christian Balzer
? > > > > Thanks for any help, > > > > Olivier > > ___ > > ceph-users mailing list > > ceph-users@lists.ceph.com > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > >

Re: [ceph-users] Lot of blocked operations

2015-09-18 Thread Christian Balzer
which, what FS are you using on your OSDs? > > > On Friday, 18 September 2015 at 12:30 +0900, Christian Balzer wrote: > > Hello, > > > > On Fri, 18 Sep 2015 02:43:49 +0200 Olivier Bonvalet wrote: > > > > The items below help, but be as specific as pos

Re: [ceph-users] Lot of blocked operations

2015-09-18 Thread Christian Balzer
Hello, On Fri, 18 Sep 2015 10:35:37 +0200 Olivier Bonvalet wrote: > On Friday, 18 September 2015 at 17:04 +0900, Christian Balzer wrote: > > Hello, > > > > On Fri, 18 Sep 2015 09:37:24 +0200 Olivier Bonvalet wrote: > > > > > Hi, > > > >

Re: [ceph-users] Lot of blocked operations

2015-09-18 Thread Christian Balzer
; > Could this be caused by monitors? In my case lagging monitors can > > > > also cause slow requests (because of slow peering). Not sure if > > > > that's expected or not, but it of course doesn't show on the OSDs > > > > as > > > >

Re: [ceph-users] Ceph Storage Cluster on Amazon EC2 across different regions

2015-09-29 Thread Christian Balzer
0119 Berlin > > > > http://www.heinlein-support.de > > > > Tel: 030 / 405051-43 > > Fax: 030 / 405051-19 > > > > Zwangsangaben lt. §35a GmbHG: > > HRB 93818 B / Amtsgericht Berlin-Charlottenburg, > > Geschäftsführer: Peer Heinlein -- Sitz: Berlin &g

Re: [ceph-users] Predict performance

2015-10-02 Thread Christian Balzer
if you're reading just 8GB in your tests and that fits nicely in the page caches of the OSDs, it will be wire speed. >Should I configure a replica factor of 3? > If you value your data, which you will on a production server, then yes. This will of course cost you 1/3 of your

Re: [ceph-users] Predict performance

2015-10-02 Thread Christian Balzer
ive you a concrete example, on my test cluster I have 5 nodes, 4 > > HDDs/OSDs each and no journal SSDs. > > So that's in theory 100 IOPS per HDD, divided by 2 for the on-disk > > journal, divided by 3 for replication: > > 20*100/2/3=333 > > Which amazingly is what
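For reference, a minimal shell sketch of the back-of-the-envelope formula quoted above (OSD count x ~100 IOPS per HDD, halved for the on-disk journal, divided by the replication factor); the node and OSD counts are the test-cluster figures from the mail:

    # Expected sustained write IOPS for a filestore cluster without SSD journals.
    nodes=5 osds_per_node=4 iops_per_hdd=100 replication=3
    echo $(( nodes * osds_per_node * iops_per_hdd / 2 / replication ))   # ~333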

Re: [ceph-users] Predict performance

2015-10-02 Thread Christian Balzer
ta set fits into the page > > > caches of your storage nodes, it will be fast, if everything needs > > > to be read from the HDDs, you're back to what these devices can do > > > (~100 IOPS per HDD). > > > > > > To give you a concrete example, on my te

[ceph-users] Cache tier experiences (for ample sized caches ^o^)

2015-10-06 Thread Christian Balzer
slow WRITES that really upset the VMs and the application they run. Clearly what I'm worried about here is that the old pool backfilling/recovering will be quite comatose (as mentioned above) during that time. Regards, Christian -- Christian Balzer Network/Systems Engineer

Re: [ceph-users] pgs stuck inactive and unclean, too feww PGs per OSD

2015-10-06 Thread Christian Balzer
mon addr = 192.168.1.153:6789 > > > [osd] > > > [osd.0] > host = storageOne > > > [osd.1] > host = storageTwo > > > [osd.2] > host = storageFour > > > [osd.3] > host = storageLast >

Re: [ceph-users] Cache tier experiences (for ample sized caches ^o^)

2015-10-07 Thread Christian Balzer
ones will be replaced eventually. Christian [snip] -- Christian Balzer Network/Systems Engineer ch...@gol.com Global OnLine Japan/Fusion Communications http://www.gol.com/ ___ ceph-users mailing list ceph-users@lists.cep

Re: [ceph-users] Cache tier experiences (for ample sized caches ^o^)

2015-10-07 Thread Christian Balzer
Hello Udo, On Wed, 07 Oct 2015 11:40:11 +0200 Udo Lembke wrote: > Hi Christian, > > On 07.10.2015 09:04, Christian Balzer wrote: > > > > ... > > > > My main suspect for the excessive slowness are actually the Toshiba DT > > type drives used. > >

Re: [ceph-users] pgs stuck inactive and unclean, too feww PGs per OSD

2015-10-07 Thread Christian Balzer
ian > Ceph is stuck creating the pgs forever. Those pgs are stuck in inactive > and unclean. And the Ceph pg query hangs forever. I googled this problem > and didn't get a clue. Is there anything I missed? > Any idea to help me? > > > -- > > Zhen Wang > >

Re: [ceph-users] pgs stuck inactive and unclean, too feww PGs per OSD

2015-10-07 Thread Christian Balzer
d anything. Christian > I have four storage nodes. Each of them has two independent hard drives > to store data. One is a 120GB SSD, and the other is a 1TB HDD. I set the > weight of the SSD to 0.1 and the weight of the HDD to 1.0. > > > > > > -- > > Zhen Wang > Shanghai Jia

Re: [ceph-users] Large LOG like files on monitor

2015-10-08 Thread Christian Balzer
e date of LOG is current) it obviously isn't safe to remove it. Christian > Regards, > Erwin > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >

Re: [ceph-users] Large LOG like files on monitor

2015-10-08 Thread Christian Balzer
ave 3 or 5 MONs up and running. Christian > Regards, > Erwin > > > > On 8 Oct 2015, at 09:57, Christian Balzer wrote > > the following: > > > > > > Hello, > > > > On Thu, 8 Oct 2015 09:38:02 +0200 Erwin Lubbers wrote: > >

Re: [ceph-users] Ceph OSD on ZFS

2015-10-14 Thread Christian Balzer
OSDs and a replication of 2. So that adding the additional node and rebuilding the old ones will actually only slightly decrease your OSD count (from assumed 8 to 6). > Also - should I put the monitor on ZFS as well? > leveldb and COW, also probably not so good. Christian > If this works

Re: [ceph-users] Does SSD Journal improve the performance?

2015-10-14 Thread Christian Balzer
with atop or the likes). However if the HW is identical in both pools your SSD may be one of those that perform abysmally with direct IO. There are plenty of threads in the ML archives about this topic. Christian > It's a big gap here, can anyone give me some suggestions here?

Re: [ceph-users] Cache Tiering Question

2015-10-15 Thread Christian Balzer

Re: [ceph-users] Minimum failure domain

2015-10-15 Thread Christian Balzer
es perfect sense. But, it got me wondering... > under what circumstances would one *not* consider a single node to be > the minimum failure domain for CRUSH purposes? > When you have a test cluster consisting of just one node, basically. Of course you would rather set the replication size to 1 in such a

Re: [ceph-users] why was osd pool default size changed from 2 to 3.

2015-10-24 Thread Christian Balzer
t" then a single PG will be active when the other > replica is under maintenance. > But if you "crush reweight to 0" before the maintenance this would not be > an issue. > Is this the main reason? > > From what I can gather even if you add new OSDs to the cluster and

Re: [ceph-users] Question about hardware and CPU selection

2015-10-25 Thread Christian Balzer
list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > -- Christian Balzer Network/Systems Engineer ch...@gol.com Global OnLine Japan/Fusion Communications http://www.gol.com/

Re: [ceph-users] 2-Node Cluster - possible scenario?

2015-10-25 Thread Christian Balzer
Depending on what these VMs do and how many of them there are, see my comments about performance. Christian > Any hints are appreciated! > > Best Regards, > Hermann > -- Christian Balzer Network/Systems Engineer ch...@gol.com Global OnLine Japan/Fusio

Re: [ceph-users] BAD nvme SSD performance

2015-10-26 Thread Christian Balzer
;rw,noatime,inode64,logbsize=256k,delaylog" > > > > filestore_xattr_use_omap = false > > > > filestore_max_inline_xattr_size = 512 > > > > filestore_max_sync_interval = 10 > > > > filestore_merge_threshold = 40 > > > > filestore_split_multiple = 8 > &g

Re: [ceph-users] BAD nvme SSD performance

2015-10-27 Thread Christian Balzer
igher and much more unpredictable. Regards, Christian > What do you think about it? > > Thanks > Regards, > Matteo > > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Somnath Roy Sent: lunedì 26 ottobre 2015

Re: [ceph-users] Choosing hp sata or sas SSDs for journals

2015-11-03 Thread Christian Balzer
be a good choice for denser nodes. Note that when looking at something similar I did choose 4 100GB DC S3700 over 2 200GB DC S3700 as the prices were nearly identical, the smaller SSDs gave me 800MB/s total instead of 730MB/s and with 8 HDDs per node I would only lose 2 OSDs in case of SSD failur
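The trade-off sketched above boils down to aggregate journal bandwidth versus how many HDD OSDs one SSD failure takes with it; a hedged shell illustration using the figures from the mail (the per-SSD sequential write rates of roughly 200MB/s and 365MB/s are assumptions based on the drive sizes mentioned):

    # 8 HDD OSDs per node, journals spread evenly across the journal SSDs.
    for ssds in 4 2; do
        per_ssd_bw=$([ "$ssds" -eq 4 ] && echo 200 || echo 365)   # MB/s, assumed per-drive write rate
        echo "${ssds} SSDs: ~$(( ssds * per_ssd_bw ))MB/s total journal bandwidth, one SSD failure kills $(( 8 / ssds )) OSDs"
    done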

Re: [ceph-users] Choosing hp sata or sas SSDs for journals

2015-11-04 Thread Christian Balzer
Hello, On Wed, 4 Nov 2015 12:03:51 +0100 Karsten Heymann wrote: > Hi, > > 2015-11-04 6:55 GMT+01:00 Christian Balzer : > > On Tue, 3 Nov 2015 12:01:16 +0100 Karsten Heymann wrote: > >> does anyone have experience with hp-branded ssds for journaling? Given that > >

Re: [ceph-users] Choosing hp sata or sas SSDs for journals

2015-11-04 Thread Christian Balzer
On Wed, 4 Nov 2015 15:33:16 +0100 Karsten Heymann wrote: > Hi, > > 2015-11-04 15:16 GMT+01:00 Christian Balzer : > > On Wed, 4 Nov 2015 12:03:51 +0100 Karsten Heymann wrote: > >> I'm currently planning to use dl380 with 26 (24 at the front, two for > >> syst

Re: [ceph-users] Building a Pb EC cluster for a cheaper cold storage

2015-11-10 Thread Christian Balzer
ournal partition on the same disk > > We think the first and second problems will be CPU and RAM on the Ceph > servers. > > Any ideas? Can it fly? > > > > _______ > ceph-users mailing list > ceph-users@lists.ceph.co

Re: [ceph-users] Performance issues on small cluster

2015-11-10 Thread Christian Balzer
min_size = 1 # Allow writing n copy in a degraded state. > osd_pool_default_pg_num = 672 > osd_pool_default_pgp_num = 672 > osd_crush_chooseleaf_type = 1 > mon_osd_full_ratio = .75 > mon_osd_nearfull_ratio = .65 > osd_backfill_full_ratio = .65 > mon_clock_drift_allowed = .15 > mon_clock_

Re: [ceph-users] High disk utilisation

2015-11-29 Thread Christian Balzer
to do with the way objects are stored on the file > system? I remember reading that as the number of objects grows the files > on disk are re-organised? > > This issue for obvious reasons causes a large degradation in > performance, is there a way of mitigating it? Will this g

Re: [ceph-users] High disk utilisation

2015-11-30 Thread Christian Balzer
n > client io 68363 kB/s wr, 1249 op/s > > > Cheers, > Bryn > > > On 30 Nov 2015, at 12:57, Christian Balzer > mailto:ch...@gol.com>> wrote: > > > Hello, > > On Mon, 30 Nov 2015 07:15:35 + MATHIAS, Bryn (Bryn) wrote: > > Hi All, &

Re: [ceph-users] how ceph mon works

2016-04-26 Thread Christian Balzer
> > > > thanks > > ___ > > ceph-users mailing list > > ceph-users@lists.ceph.com > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > ___ > ceph-users mailing list &g

Re: [ceph-users] pgnum warning and decrease

2016-04-27 Thread Christian Balzer
h resources (CPU/RAM mostly) yes. > If I were to change it to decrease it to 1024, is this a safe way: > http://www.sebastien-han.fr/blog/2013/03/12/ceph-change-pg-number-on-the-fly/ > seems to make sense, but I don't have enough ceph experience (and guts) > to give it a go... >

Re: [ceph-users] Lab Newbie Here: Where do I start?

2016-05-02 Thread Christian Balzer
s they don't support it (especially RBD). In no particular order: OpenStack OpenNebula ganeti Qemu/KVM w/o any cluster manager (or Pacemaker as CRM) do support RBD. Also on which HW do you plan to run those VMs? Your 2 DL360s will probably be maxed out by running Ceph. Christian

Re: [ceph-users] OSD - Slow Requests

2016-05-04 Thread Christian Balzer
43_object9795 > [write 0~131072] 308.7e0944a ack+ondisk+write+known_if_redirected > e14815) currently waiting for subops from 84,97 2016-05-04 > 14:02:59.140562 osd.84 [WRN] 33 slow requests, 1 included below; oldest > blocked for > 58.267177 secs > > > -- Chr

Re: [ceph-users] Erasure pool performance expectations

2016-05-06 Thread Christian Balzer
te useful readforward and readproxy modes weren't either the last time I looked. But Nick mentioned them (and the confusion of their default values). Christian > > > > 3) The cache tier to fill up quickly when empty but change slowly once > > it's full (ie limiting

Re: [ceph-users] One osd crashing daily, the problem with osd.50

2016-05-09 Thread Christian Balzer
em, > rather than zapping the md and recreating from scratch. I was also > worrying if there was something fundamentally wrong about running osd's > on software md raid5 devices. > No problem in and of itself, other than reduced performance. Regards, Christian -- Christian Balz

Re: [ceph-users] thanks for a double check on ceph's config

2016-05-10 Thread Christian Balzer
to waste on RBD cache? If so, bully for you, but you might find that depending on your use case a smaller RBD cache but more VM memory (for pagecache, SLAB, etc) could be more beneficial. > rbd_cache_max_dirty = 134217728 > rbd_cache_max_dirty_age = 5 Christia

Re: [ceph-users] journal or cache tier on SSDs ?

2016-05-10 Thread Christian Balzer
't it? > If 24 nodes is the absolute limit of your cluster, you want to set the target pg num to 100 in the calculator, which gives you 8192 again. Keep in mind that splitting PGs is an expensive operation, so if 24 isn't a hard upper limit, you might be better off starting big. Chr
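For context, a hedged sketch of the pgcalc-style arithmetic that yields figures like the 8192 above (OSDs x target PGs per OSD / replica count, then taken to a nearby power of two); the OSD count and pool size below are made-up example values, not the poster's actual cluster:

    # Rough PG count estimate; 288 OSDs and size=3 are assumed for illustration.
    osds=288 target_pgs_per_osd=100 size=3
    raw=$(( osds * target_pgs_per_osd / size ))        # 9600
    pow=1; while [ "$pow" -lt "$raw" ]; do pow=$(( pow * 2 )); done
    echo "raw=${raw}, candidate power-of-two pg_num: $(( pow / 2 )) or ${pow}"   # 8192 or 16384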

Re: [ceph-users] thanks for a double check on ceph's config

2016-05-10 Thread Christian Balzer
ing. > (3) osd_mount_options_xfs = > "rw,noexec,nodev,noatime,nodiratime,nobarrier" What are your suggested > options here? > As I said, lose the "nobarrier". Christian > Thanks a lot. > > > 2016-05-10 15:31 GMT+08:00 Christian Balzer : > >

Re: [ceph-users] journal or cache tier on SSDs ?

2016-05-10 Thread Christian Balzer
che pool (and eventually to the > > HDDs, you can time that with lowering the dirty ratio during off-peak > > hours). > > I gonna give a look on that, thanks for the tips. > > >> We gonna use an EC pool for big files (jerasure 8+2 I think) and a > >> repl

Re: [ceph-users] Performance during disk rebuild - MercadoLibre

2016-05-10 Thread Christian Balzer
' > ceph tell osd.* injectargs '--osd-recovery-op-priority 1' > ceph tell osd.* injectargs '--osd-client-op-priority 63' > > The question is, are there more parameters to change in order to make > the OSD rebuild more gradual? > > I really appreciate yo
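Beyond the priority settings quoted above, the knobs most commonly suggested on this list for making recovery/backfill more gradual are the per-OSD concurrency limits; a hedged example (the values are illustrative, and injectargs changes are runtime-only):

    # Limit how many backfills and recovery ops each OSD runs in parallel.
    ceph tell osd.* injectargs '--osd-max-backfills 1'
    ceph tell osd.* injectargs '--osd-recovery-max-active 1'
    # To persist across restarts, set the same options in the [osd] section of ceph.conf.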

Re: [ceph-users] journal or cache tier on SSDs ?

2016-05-10 Thread Christian Balzer
nition DB (on Debian that can be done with "update-smart-drivedb"). Intel's calculation of the media wearout always seems to be very fuzzy to me, given your 7TB written I'd expect it to be 98%, at least 99%. But then again a 200GB DC S3700 of mine has written 90TB out of 3650TB t

Re: [ceph-users] journal or cache tier on SSDs ?

2016-05-10 Thread Christian Balzer
o avoid putting too many journals on one SSD, as a failure of the SSD will kill all associated HDD OSDs. However as you have 21 hosts and hopefully decent redundancy and distribution (CRUSH Map), going with 2 SSDs (6 journals per SSD) should be fine. Christian -- Christian Balzer Network/S

Re: [ceph-users] Weighted Priority Queue testing

2016-05-11 Thread Christian Balzer
when backfills and recovery settings are lowered. Regards, Christian -- Christian Balzer Network/Systems Engineer ch...@gol.com Global OnLine Japan/Rakuten Communications http://www.gol.com/ ___ ceph-users mailing list

Re: [ceph-users] thanks for a double check on ceph's config

2016-05-11 Thread Christian Balzer
nt_message_size_cap = 2147483648 > >> osd_deep_scrub_stride = 131072 > >> osd_op_threads = 8 > >> osd_disk_threads = 4 > >> osd_map_cache_size = 1024 > >> osd_map_cache_bl_size = 128 > >> osd_mount_options_xfs = "rw,noexec,nodev,n

Re: [ceph-users] rbd resize option

2016-05-11 Thread Christian Balzer
> >> > >> Thanks > >> Swami > >> ___ > >> ceph-users mailing list > >> ceph-users@lists.ceph.com > >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >

Re: [ceph-users] journal or cache tier on SSDs ?

2016-05-11 Thread Christian Balzer
in 3500 OEM models) are very much unsuited for journal use. > But if I were you my choice would be between caching and moving them > to a non-ceph use. > A readforward or readonly cache-tier with very strict promotion rules is probably the best fit for

Re: [ceph-users] Weighted Priority Queue testing

2016-05-11 Thread Christian Balzer
hat a single PG/OSD can handle. Christian > Thanks & Regards > Somnath > > -Original Message- > From: Christian Balzer [mailto:ch...@gol.com] > Sent: Wednesday, May 11, 2016 12:31 AM > To: Somnath Roy > Cc: Mark Nelson; Nick Fisk; ceph-users@lists.ceph.com > S

Re: [ceph-users] about available space

2016-05-12 Thread Christian Balzer
ect block pool has 70% of space, each of the other > > pools has 10% of storage space. > > > > Thanks. > > ___ > > ceph-users mailing list > > ceph-users@lists.ceph.com > > http://lists.ceph.com/listinfo.c

Re: [ceph-users] Weighted Priority Queue testing

2016-05-12 Thread Christian Balzer
e it's throttled but simply because there are so few PGs/OSDs to choose from. Or so it seems, purely from observation. Christian > On Wed, May 11, 2016 at 6:29 PM Christian Balzer wrote: > > > On Wed, 11 May 2016 16:10:06 + Somnath Roy wrote: > > > > > I bump

Re: [ceph-users] Weighted Priority Queue testing

2016-05-12 Thread Christian Balzer
uld expect 24 backfills. The prospective source OSDs aren't pegged with backfills either, they have 1-2 going on. I'm seriously wondering if this behavior is related to what we're talking about here. Christian > Thanks & Regards > Somnath > > -Original Message

Re: [ceph-users] Recommended OSD size

2016-05-13 Thread Christian Balzer
n. > The most important statement/question last. You will want to build a test cluster and verify that your application(s) are actually working well with CephFS, because if you read the ML there are cases when this may not be true. Christian -- Christian Balzer Network/Systems Engineer

Re: [ceph-users] Weighted Priority Queue testing

2016-05-13 Thread Christian Balzer
ighted priorities and buckets (prioritize the bucket of the OSD with the most target PGs). Regards, Christian > Regards > Somnath > > -----Original Message- > From: Christian Balzer [mailto:ch...@gol.com] > Sent: Thursday, May 12, 2016 11:52 PM > To: Somnath Roy > Cc: Sc

Re: [ceph-users] Steps for Adding Cache Tier

2016-05-13 Thread Christian Balzer
ght be missing here? Are there any other issues > that we might need to be aware of? I seem to recall some discussion on > the list with regard to settings that were required to make caching work > correctly, but my memory seems to indicate that these changes were > already added

Re: [ceph-users] Starting a cluster with one OSD node

2016-05-14 Thread Christian Balzer
g. > > Filesystem issue or kernel, but also as you add nodes the data > > movement will introduce a good deal of overhead. > > > > Regards, > > Alex > > > > > >> > >> Cheers, > >> Mike > >> > >> > &g

Re: [ceph-users] Erasure pool performance expectations

2016-05-16 Thread Christian Balzer
> > > > > > > > > > > > I confirmed the settings are indeed correctly picked up across the > > nodes in > > > > the cluster. > > > > > > Good, glad we got that sorted > > > > >

Re: [ceph-users] Erasure pool performance expectations

2016-05-16 Thread Christian Balzer
is indeed one of the reasons. The other reason was that I thought > that by removing dirty objects I didn't need replication on the cache > tier, which I'm now starting to doubt again... You absolutely want your cache tier to have sufficient replication. 2 at the very lea

Re: [ceph-users] Increasing pg_num

2016-05-16 Thread Christian Balzer
; > > pg_num is the actual amount of PGs. This you can increase without any > actual data moving. > Yes and no. Increasing the pg_num will split PGs, which causes potentially massive I/O. Also AFAIK that I/O isn't regulated by the various recovery and backfill parameters.

Re: [ceph-users] Increasing pg_num

2016-05-16 Thread Christian Balzer
Hello, On Tue, 17 May 2016 10:47:15 +1000 Chris Dunlop wrote: > On Tue, May 17, 2016 at 08:21:48AM +0900, Christian Balzer wrote: > > On Mon, 16 May 2016 22:40:47 +0200 (CEST) Wido den Hollander wrote: > > > > > > pg_num is the actual amount of PGs. This y

Re: [ceph-users] Increasing pg_num

2016-05-16 Thread Christian Balzer
Hello, On Tue, 17 May 2016 12:12:02 +1000 Chris Dunlop wrote: > Hi Christian, > > On Tue, May 17, 2016 at 10:41:52AM +0900, Christian Balzer wrote: > > On Tue, 17 May 2016 10:47:15 +1000 Chris Dunlop wrote: > > Most of your questions would be easily answered if you did sp

Re: [ceph-users] v0.94.7 Hammer released

2016-05-17 Thread Christian Balzer
> http://docs.ceph.com/docs/master/_downloads/v0.94.6.txt > >> > > >> > Getting Ceph > >> > > >> > > >> > * Git at git://github.com/ceph/ceph.git > >> > * Tarball at http://download.ceph.com/tarballs/ceph-0.94.7.tar.gz >

Re: [ceph-users] dense storage nodes

2016-05-18 Thread Christian Balzer
out of sockets # Also increase the max packet backlog net.core.somaxconn = 1024 net.core.netdev_max_backlog = 5 net.ipv4.tcp_max_syn_backlog = 3 net.ipv4.tcp_max_tw_buckets = 200 net.ipv4.tcp_tw_reuse = 1 net.ipv4.tcp_fin_timeout = 10 # Disable TCP slow start on idle connections net.ipv4.

Re: [ceph-users] OSD node memory sizing

2016-05-18 Thread Christian Balzer
is a requirement in your use case. Christian > Thanks for any comment > Dietmar > > [1] http://docs.ceph.com/docs/jewel/start/hardware-recommendations/ > [2] > https://www.redhat.com/en/files/resources/en-rhst-cephstorage-supermicro-INC0270868_v2_0715.pdf > -- Ch

Re: [ceph-users] dense storage nodes

2016-05-18 Thread Christian Balzer
nough RAM to keep all your important bits in memory can be a game changer. Christian -- Christian Balzer Network/Systems Engineer ch...@gol.com Global OnLine Japan/Rakuten Communications http://www.gol.com/ ___ ceph-u

Re: [ceph-users] OSD node memory sizing

2016-05-18 Thread Christian Balzer
e it faster by itself AND enable it to use more CPU resources as well. The NVMes (DC P3700 one presumes?) just for cache tiering, no SSD journals for the OSDs? What are your network plans then, as in is your node storage bandwidth a good match for your network bandwidth? > > That is

Re: [ceph-users] dense storage nodes

2016-05-18 Thread Christian Balzer
Note that if your total NVMe write bandwidth is more than the total > > disk bandwidth they act as buffers capable of handling short write > > bursts (only if there's no read on recent writes which should almost > > never happen for RBD but might for other uses) so yo

Re: [ceph-users] dense storage nodes

2016-05-18 Thread Christian Balzer
> PG count to the next power of 2). We also set vfs_cache_pressure to 1, > > though this didn't really seem to do much at the time. I've also seen > > recommendations about setting min_free_kbytes to something higher > > (currently 90112 on our hardware) but have not verified this. > > >

Re: [ceph-users] mark out vs crush weight 0

2016-05-18 Thread Christian Balzer
likely make > the normal osd startup crush location update do so with the OSDs > advertised capacity). Is it sensible? > > And/or, anybody have a good idea how the tools can/should be changed to > make the osd replacement re-use the osd id? > > sage > > >

Re: [ceph-users] dense storage nodes

2016-05-18 Thread Christian Balzer
Hello Kris, On Wed, 18 May 2016 19:31:49 -0700 Kris Jurka wrote: > > > On 5/18/2016 7:15 PM, Christian Balzer wrote: > > >> We have hit the following issues: > >> > >> - Filestore merge splits occur at ~40 MObjects with default > >> setti

Re: [ceph-users] OSD node memory sizing

2016-05-19 Thread Christian Balzer
Hello, On Thu, 19 May 2016 10:51:20 +0200 Dietmar Rieder wrote: > Hello, > > On 05/19/2016 03:36 AM, Christian Balzer wrote: > > > > Hello again, > > > > On Wed, 18 May 2016 15:32:50 +0200 Dietmar Rieder wrote: > > > >> Hello Christian, >

Re: [ceph-users] ceph storage capacity does not free when deleting contents from RBD volumes

2016-05-19 Thread Christian Balzer
om/listinfo.cgi/ceph-users-ceph.com > > > > _______ > ceph-users mailing list > ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- Christian Balzer Network/Sys

Re: [ceph-users] Do you see a data loss if a SSD hosting several OSD journals crashes

2016-05-19 Thread Christian Balzer
nded recipient, you are on notice that any > distribution of this message, in any form, is strictly prohibited. If > you have received this message in error, please immediately notify the > sender and delete or destroy any copy of this message! -- Christi

Re: [ceph-users] mark out vs crush weight 0

2016-05-19 Thread Christian Balzer
hile it's moving, there is a true performance hit on the virtual servers. > > So if this could be solved by an IOPS/HDD bandwidth rate limit, so that I > can simply tell the cluster to use max. 10 IOPS and/or 10 MB/s for the > recovery, then i

Re: [ceph-users] Do you see a data loss if a SSD hosting several OSD journals crashes

2016-05-19 Thread Christian Balzer
ng be very small if you choose the right type of SSD, Intel DC 37xx or at least 36xx for example. Christian > - epk > > -----Original Message- > From: Christian Balzer [mailto:ch...@gol.com] > Sent: Thursday, May 19, 2016 7:00 PM > To: ceph-users@lists.ceph.com > Cc: EP Komar

Re: [ceph-users] OSDs automount all devices on a san

2016-05-20 Thread Christian Balzer
there a way within ceph to tell a particular OSS to ignore OSDs that > aren't meant for it? It's odd to me that a mere partprobe causes the OSD > to mount even. > > > Brian Andrus > ITACS/Research Computing > Naval Postgraduate School > Monterey, California > voice: 8

Re: [ceph-users] Cant remove ceph filesystem

2016-05-20 Thread Christian Balzer
960.html That said, this is very poorly documented, like other CephFS bits, when it comes to manual deployment. Christian -- Christian Balzer Network/Systems Engineer ch...@gol.com Global OnLine Japan/Rakuten

Re: [ceph-users] free krbd size in ubuntu12.04 in ceph 0.67.9

2016-05-20 Thread Christian Balzer
d/or with librbd and fuse and then run fstrim. However trim is a pretty costly activity in Ceph, so it may a) impact your cluster performance and b) take a while, depending on how much data we're talking about. Lastly, while having a sparse storage service like Ceph is very nice I always try to ha
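A minimal sketch of the space-reclaim procedure described above, assuming the RBD is attached in a way that passes discard/TRIM through to Ceph; the device node and mountpoint are placeholders:

    # Either mount with online discard enabled...
    mount -o discard /dev/rbd0 /mnt/rbd
    # ...or keep discard off and trim in one (potentially slow) batch instead:
    fstrim -v /mnt/rbd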

Re: [ceph-users] dense storage nodes

2016-05-20 Thread Christian Balzer
uch better fit for you. Regards, Christian > regards, > Ben > > On Wed, May 18, 2016 at 10:01 PM, Christian Balzer wrote: > > > > Hello, > > > > On Wed, 18 May 2016 12:32:25 -0400 Benjeman Meekhof wrote: > > > >> Hi Lionel, > >> > >

Re: [ceph-users] Diagnosing slow requests

2016-05-23 Thread Christian Balzer
cs/master/rados/troubleshooting/troubleshooting-osd/ > > to hopefully find out what's happening but as far as the hardware is > > concerned everything looks fine. No smart errors logged, iostats shows > > some activity but nothing pegged to 100%, no messages in dmesg and t

Re: [ceph-users] Public and Private network over 1 interface

2016-05-23 Thread Christian Balzer
rformance perspective, has anybody observed a > > >> significant performance hit by untagging vlans on the node? This is > > >> something I > > can't > > >> test, since I don't currently own 40 gig gear. > > >> > > >> 3.a)

Re: [ceph-users] NVRAM cards as OSD journals

2016-05-23 Thread Christian Balzer
distribution of this message, in any form, is strictly prohibited. If > you have received this message in error, please immediately notify the > sender and delete or destroy any copy of this message! -- Christian Balzer Network/Systems Engineer ch...@gol.com

Re: [ceph-users] dense storage nodes

2016-05-23 Thread Christian Balzer
nodes — pinning OSD processes, HBA/NIC interrupts etc. > to cores/sockets to limit data sent over QPI links on NUMA > architectures. It’s easy to believe that modern inter-die links are > Fast Enough For You Old Man but there’s more to it. > Ayup, very much so. Christian -- Chr

Re: [ceph-users] Falls cluster then one node switch off

2016-05-23 Thread Christian Balzer
_ruleset { > ruleset 0 > type replicated > min_size 1 > max_size 10 > step take default > step chooseleaf firstn 0 type host > step emit > } > # end crush map > > _

Re: [ceph-users] Diagnosing slow requests

2016-05-24 Thread Christian Balzer
sing the pool in read-forward now so there should be > > almost no promotion from EC to the SSD pool. I will see what options I > > have for adding some SSD journals to the OSD nodes to help speed > > things along. > > > > Thanks, and apologies again for missing your e

Re: [ceph-users] Blocked ops, OSD consuming memory, hammer

2016-05-24 Thread Christian Balzer
's just beyond odd. As for Heath, we do indeed need more data, as in: a) How busy are your HDD nodes? (atop, iostat). Any particular HDDs/OSDs standing out, as in being slower/busier for a prolonged time? b) No SSD journals for the spinners, right? c) The memory exhaustion is purely caused by the

Re: [ceph-users] SSD randwrite performance

2016-05-24 Thread Christian Balzer
onfig for fio. > > I am confused because EMC ScaleIO can do much more iops, which is boring > my boss :) > There are lots of discussions and slides on how to improve/maximize IOPS with Ceph, go search for them. Fast CPUs, jemalloc, pinning, configuration, NVMes for journals, etc. Chris

Re: [ceph-users] Falls cluster then one node switch off

2016-05-24 Thread Christian Balzer
lean > 262 pgs stuck undersized > 408 pgs undersized > recovery 315/1098 objects degraded (28.689%) > recovery 234/1098 objects misplaced (21.311%) > 1 mons down, quorum 0,2 ceph1-node,ceph-mon2 > monmap e1: 3 mons at &g

Re: [ceph-users] Ceph crash, how to analyse and recover

2016-05-25 Thread Christian Balzer
t notify the sender > immediately by return e-mail. University Medical Center Utrecht is a > legal person by public law and is registered at the Chamber of Commerce > for Midden-Nederland under no. 30244197. > > Please consider the environment before printing this e-mail. -- Christia
