gle Intel SSD, DC or otherwise.
Christian
> Jan
>
>
> > On 07 Sep 2015, at 05:53, Christian Balzer wrote:
> >
> > On Sat, 5 Sep 2015 07:13:29 -0300 German Anders wrote:
> >
> >> Hi Christian,
> >>
> >> Ok, so would you say that it'
recommended that barriers are turned off as the drive has a
> > safe cache (I am confident that the cache will write out to disk on
> > power failure)?
> >
> > Has anyone else encountered this issue?
> >
> > Any info or sug
ort: SUCCESS
scmd(880fdc85b680)
---
Note that on the un-patched node (DRBD replication target) I managed to
trigger this bug 3 times in the same period.
So unless Intel has something to say (and given that this happens with
Samsungs as well), I'd still look beady eyed at LSI/Avago...
Chri
pe.com
>
16=3.2%, 32=0.0%, 64=0.0%,
> >=64=0.0%
> issued: total=r=0/w=102400/d=0, short=r=0/w=0/d=0,
> drop=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%,
> depth=16
>
> Run status group 0 (all jobs):
> WRITE: io=102400MB, aggrb=121882KB/s, minb=121882KB/s, m
art    End     Size    File system  Name  Flags
> 1 1049kB 211MB 210MB ext4 boot
> 2 211MB 21.2GB 21.0GB ext4
> 3 21.2GB 29.6GB 8389MB linux-swap(v1)
>
> There is enough space for many 5G journal partitions on sda
--
Chri
sy?
And unless you deploy like 10 of them initially, a node of that size going
down will severely impact your cluster performance.
>
> So which rule should we consider that holds true for a 12 OSD node
> and even for a 72 OSD node?
2GB per OSD plus OS/other needs, round up to what
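To put rough numbers on that rule (just a sketch; the 2GB/OSD figure is the
rule of thumb quoted above, not a measurement):
  12 OSDs x 2GB =  24GB + OS/other -> round up to 32GB
  72 OSDs x 2GB = 144GB + OS/other -> round up to 160GB or 192GB
and leave some headroom on top of that, since OSD memory use tends to peak
during recovery and backfill.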
plan to update the
> firmware on the remainder of the S3710 drives this week and also set
> nobarriers.
>
> Regards,
>
> Richard
>
>
>
> On 8 September 2015 at 14:27, Richard Bade <hitr...@gmail.com> wrote:
>
> Hi Christian,
>
>
>
&
osd.11 0.140 root=ssd”
> >>
> >> I’m able to verify that the OSD / MON host and another MON I have
> >> running see the same CRUSH map.
> >>
> >> After rebooting OSD / MON host, both osd.10 and osd.11 become part of
> >> the default bucket. How can I
d the
> > partition again automatically without reconfiguration)
> > - start the OSD
> >
> > If you script this you should not have to use noout: the OSD should
> > come back in a matter of seconds and the impact on the storage network
> > minimal.
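A minimal sketch of such a script for a filestore OSD (the OSD id, paths and
the stop/start commands are illustrative and depend on your init system):

ID=12                               # hypothetical OSD id
systemctl stop ceph-osd@$ID         # or: service ceph stop osd.$ID
ceph-osd -i $ID --flush-journal     # drain the old journal while the OSD is down
# ... replace / re-create the journal partition here ...
ceph-osd -i $ID --mkjournal         # initialise the new journal
systemctl start ceph-osd@$ID        # back in within seconds, no noout needed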
> >
> > Not
?
> >
> > Thanks for any help,
> >
> > Olivier
> >
>
which, what FS are you using on your OSDs?
>
>
> On Friday, 18 September 2015 at 12:30 +0900, Christian Balzer wrote:
> > Hello,
> >
> > On Fri, 18 Sep 2015 02:43:49 +0200 Olivier Bonvalet wrote:
> >
> > The items below help, but be as specific as pos
Hello,
On Fri, 18 Sep 2015 10:35:37 +0200 Olivier Bonvalet wrote:
> On Friday, 18 September 2015 at 17:04 +0900, Christian Balzer wrote:
> > Hello,
> >
> > On Fri, 18 Sep 2015 09:37:24 +0200 Olivier Bonvalet wrote:
> >
> > > Hi,
> > >
>
; > Could this be caused by monitors? In my case lagging monitors can
> > > > also cause slow requests (because of slow peering). Not sure if
> > > > that's expected or not, but it of course doesn't show on the OSDs
> > > > as
> > > >
0119 Berlin
> >
> > http://www.heinlein-support.de
> >
> > Tel: 030 / 405051-43
> > Fax: 030 / 405051-19
> >
> > Mandatory information per §35a GmbHG:
> > HRB 93818 B / Amtsgericht (local court) Berlin-Charlottenburg,
> > Managing director: Peer Heinlein -- Registered office: Berlin
>
if you're reading just 8GB in your tests and that fits nicely in
the page caches of the OSDs, it will be wire speed.
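If you want to measure the disks rather than RAM, one approach (a sketch,
sizes and paths made up) is to drop the caches on the storage nodes and use a
working set well beyond their combined memory:

# on each OSD node before the run:
sync; echo 3 > /proc/sys/vm/drop_caches
# on the client; --direct=1 keeps the client's own page cache out of the picture:
fio --name=coldread --filename=/mnt/rbd/testfile --size=200G --rw=randread \
    --bs=4k --iodepth=32 --direct=1 --runtime=300 --time_based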
> Should I configure a replica factor of 3?
>
If you value your data, which you will on a production server, then yes.
This will of course cost you 1/3 of your
ive you a concrete example, on my test cluster I have 5 nodes, 4
> > HDDs/OSDs each and no journal SSDs.
> > So that's in theory 100 IOPS per HDD, divided by 2 for the on-disk
> > journal, divided by 3 for replication:
> > 20*100/2/3=333
> > Which amazingly is what
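The same arithmetic as a general rule of thumb (filestore with the journal on
the same disk):
  sustained write IOPS ~= (number of HDDs x IOPS per HDD) / 2 / replication
so 20 x 100 / 2 / 3 ~= 333 here, or for example 24 x 100 / 2 / 3 = 400 for a
24 HDD cluster.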
ta set fits into the page
> > > caches of your storage nodes, it will be fast, if everything needs
> > > to be read from the HDDs, you're back to what these devices can do
> > > (~100 IOPS per HDD).
> > >
> > > To give you a concrete example, on my te
slow WRITES that really upset the VMs and the
application they run.
Clearly what I'm worried about here is that the old pool
backfilling/recovering will be quite comatose (as mentioned above) during
that time.
Regards,
Christian
--
Christian Balzer        Network/Systems Engineer
mon addr = 192.168.1.153:6789
>
>
> [osd]
>
>
> [osd.0]
> host = storageOne
>
>
> [osd.1]
> host = storageTwo
>
>
> [osd.2]
> host = storageFour
>
>
> [osd.3]
> host = storageLast
>
ones
will be replaced eventually.
Christian
[snip]
--
Christian Balzer        Network/Systems Engineer
ch...@gol.com Global OnLine Japan/Fusion Communications
http://www.gol.com/
Hello Udo,
On Wed, 07 Oct 2015 11:40:11 +0200 Udo Lembke wrote:
> Hi Christian,
>
> On 07.10.2015 09:04, Christian Balzer wrote:
> >
> > ...
> >
> > My main suspect for the excessive slowness are actually the Toshiba DT
> > type drives used.
> >
ian
> Ceph stucks in creating the pgs forever. Those pgs are stuck in inactive
> and unclean. And the Ceph pg query hangs forever. I googled this problem
> and didn't get a clue. Is there anything I missed?
> Any idea to help me?
>
>
> --
>
> Zhen Wang
>
>
d anything.
Christian
> I have four storage nodes. Each of them has two independent hard drive
> to store data. One is 120GB SSD, and the other is 1TB HDD. I set the
> weight of SSD is 0.1 and weight of HDD is 1.0.
>
>
>
>
>
> --
>
> Zhen Wang
> Shanghai Jia
e date of
LOG is current) it obviously isn't safe to remove it.
Christian
> Regards,
> Erwin
>
ave 3 or 5 MONs up and running.
Christian
> Regards,
> Erwin
>
>
> > On 8 Oct 2015, at 09:57, Christian Balzer wrote the following:
> >
> >
> > Hello,
> >
> > On Thu, 8 Oct 2015 09:38:02 +0200 Erwin Lubbers wrote:
> >
OSDs and a
replication of 2.
So that adding the additional node and rebuilding the old ones will
actually only slightly decrease your OSD count (from assumed 8 to 6).
> Also - should I put the monitor on ZFS as well?
>
leveldb and COW, also probably not so good.
Christian
> If this works
with atop or the likes).
However if the HW is identical in both pools your SSD may be one of those
that perform abysmally with direct IO.
There are plenty of threads in the ML archives about this topic.
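The usual quick check (a sketch, device name made up, and note that it
overwrites whatever is on that device) is a single-threaded sync write test:

fio --name=journal-test --filename=/dev/sdX --direct=1 --sync=1 \
    --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60

A journal-worthy DC SSD sustains thousands of these 4k sync writes per second,
while many consumer models collapse to a few hundred.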
Christian
> It's a big gap here, anyone can give me some suggestion here?
es perfect sense. But, it got me wondering...
> under what circumstances would one *not* consider a single node to be
> the minimum failure domain for CRUSH purposes?
>
When you have a test cluster consisting of just one node, basically.
Of course you would rather set the replication size to 1 in such a
t" then a single PG will be active when the other
> replica is under maintenance.
> But if you "crush reweight to 0" before the maintenance this would not be
> an issue.
> Is this the main reason?
>
> From what I can gather even if you add new OSDs to the cluster and
>
--
Christian Balzer        Network/Systems Engineer
ch...@gol.com Global OnLine Japan/Fusion Communications
http://www.gol.com/
Depending on what these VMs do and the amount of them, see my comments
about performance.
Christian
> Any hints are appreciated!
>
> Best Regards,
> Hermann
>
--
Christian Balzer        Network/Systems Engineer
ch...@gol.com Global OnLine Japan/Fusio
"rw,noatime,inode64,logbsize=256k,delaylog"
> >
> > filestore_xattr_use_omap = false
> >
> > filestore_max_inline_xattr_size = 512
> >
> > filestore_max_sync_interval = 10
> >
> > filestore_merge_threshold = 40
> >
> > filestore_split_multiple = 8
> &g
igher and much more unpredictable.
Regards,
Christian
> What do you think about it?
>
> Thanks
> Regards,
> Matteo
>
> -----Original Message-----
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Somnath Roy Sent: Monday, 26 October 2015
be a
good choice for denser nodes.
Note that when looking at something similar I did choose 4 100GB DC S3700
over 2 200GB DC S3700 as the prices were nearly identical, the smaller
SSDs gave me 800MB/s total instead of 730MB/s and with 8 HDDs per node I
would only lose 2 OSDs in case of SSD failure.
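For reference, the arithmetic (spec sheet figures, so treat them as
approximate):
  4 x 100GB DC S3700: 4 x ~200MB/s = ~800MB/s, 8 HDDs / 4 SSDs = 2 journals per SSD
  2 x 200GB DC S3700: 2 x ~365MB/s = ~730MB/s, 8 HDDs / 2 SSDs = 4 journals per SSD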
Hello,
On Wed, 4 Nov 2015 12:03:51 +0100 Karsten Heymann wrote:
> Hi,
>
> 2015-11-04 6:55 GMT+01:00 Christian Balzer :
> > On Tue, 3 Nov 2015 12:01:16 +0100 Karsten Heymann wrote:
> >> has anyone experiences with hp-branded ssds for journaling? Given that
> >
On Wed, 4 Nov 2015 15:33:16 +0100 Karsten Heymann wrote:
> Hi,
>
> 2015-11-04 15:16 GMT+01:00 Christian Balzer :
> > On Wed, 4 Nov 2015 12:03:51 +0100 Karsten Heymann wrote:
> >> I'm currently planning to use dl380 with 26 (24 at the front, two for
> >> syst
ournal partition on the same disk
>
> We think the first and second problems will be CPU and RAM on the Ceph
> servers.
>
> Any ideas? Can it fly?
>
>
>
min_size = 1 # Allow writing n copy in a degraded state.
> osd_pool_default_pg_num = 672
> osd_pool_default_pgp_num = 672
> osd_crush_chooseleaf_type = 1
> mon_osd_full_ratio = .75
> mon_osd_nearfull_ratio = .65
> osd_backfill_full_ratio = .65
> mon_clock_drift_allowed = .15
> mon_clock_
to do with the way objects are stored on the file
> system? I remember reading that as the number of objects grows, the files
> on disk are re-organised?
>
> This issue for obvious reasons causes a large degradation in
> performance, is there a way of mitigating it? Will this g
n
> client io 68363 kB/s wr, 1249 op/s
>
>
> Cheers,
> Bryn
>
>
> On 30 Nov 2015, at 12:57, Christian Balzer
> <ch...@gol.com> wrote:
>
>
> Hello,
>
> On Mon, 30 Nov 2015 07:15:35 + MATHIAS, Bryn (Bryn) wrote:
>
> Hi All,
&
> >
> > thanks
h resources (CPU/RAM mostly) yes.
> If I were to change it to decrease it to 1024, is this a safe way:
> http://www.sebastien-han.fr/blog/2013/03/12/ceph-change-pg-number-on-the-fly/
> seems to make sense, but I don't have enough ceph experience (and guts)
> to give it a go...
>
s they don't
support it (especially RBD).
In no particular order:
OpenStack
OpenNebula
ganeti
Qemu/KVM w/o any cluster manager (or Pacemaker as CRM)
do support RBD.
Also on which HW do you plan to run those VMs?
Your 2 DL360s will probably be maxed out by running Ceph.
Christian
43_object9795
> [write 0~131072] 308.7e0944a ack+ondisk+write+known_if_redirected
> e14815) currently waiting for subops from 84,97 2016-05-04
> 14:02:59.140562 osd.84 [WRN] 33 slow requests, 1 included below; oldest
> blocked for > 58.267177 secs
>
>
>
--
Chr
te useful readforward and readproxy modes
weren't either the last time I looked.
But Nick mentioned them (and the confusion of their default values).
Christian
> >
> > 3) The cache tier to fill up quickly when empty but change slowly once
> > it's full (ie limiting
em,
> rather than zapping the md and recreating from scratch. I was also
> worried that there was something fundamentally wrong about running OSDs
> on software md raid5 devices.
>
No problem in and of itself, other than reduced performance.
Regards,
Christian
--
Christian Balz
to waste on RBD cache?
If so, bully for you, but you might find that depending on your use case a
smaller RBD cache but more VM memory (for pagecache, SLAB, etc) could be
more beneficial.
> rbd_cache_max_dirty = 134217728
> rbd_cache_max_dirty_age = 5
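For comparison, a more modest client-side cache might look like this (a sketch
only, the numbers are purely illustrative; max dirty has to stay below the
cache size):

[client]
rbd_cache = true
rbd_cache_size = 67108864           # 64MB total
rbd_cache_max_dirty = 50331648      # 48MB dirty ceiling
rbd_cache_max_dirty_age = 2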
Christia
't it?
>
If 24 nodes is the absolute limit of your cluster, you want to set the
target pg num to 100 in the calculator, which gives you 8192 again.
Keep in mind that splitting PGs is an expensive operation, so if 24 isn't
a hard upper limit, you might be better off starting big.
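For reference, the arithmetic behind the calculator (a sketch; the OSD count
below is only an example, plug in your own):
  pg_num ~= (number of OSDs x target PGs per OSD) / replication size,
            rounded up to a power of 2
  e.g. 240 OSDs x 100 / 3 = 8000 -> 8192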
Chr
ing.
> (3) osd_mount_options_xfs =
> "rw,noexec,nodev,noatime,nodiratime,nobarrier" What's your suggested
> options here?
>
As I said, lose the "nobarrier".
Christian
> Thanks a lot.
>
>
> 2016-05-10 15:31 GMT+08:00 Christian Balzer :
>
> >
che pool (and eventually to the
> > HDDs, you can time that with lowering the dirty ratio during off-peak
> > hours).
>
> I'm gonna take a look at that, thanks for the tips.
>
> >> We gonna use an EC pool for big files (jerasure 8+2 I think) and a
> >> repl
> ceph tell osd.* injectargs '--osd-recovery-op-priority 1'
> ceph tell osd.* injectargs '--osd-client-op-priority 63'
>
> The question is, are there more parameters to change in order to make the
> OSD rebuild more gradual?
>
> I really appreciate yo
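The knobs most commonly lowered alongside the two priorities above are the
backfill and recovery concurrency limits, for example (values illustrative,
same injectargs syntax as before):

ceph tell osd.* injectargs '--osd-max-backfills 1'
ceph tell osd.* injectargs '--osd-recovery-max-active 1'
ceph tell osd.* injectargs '--osd-recovery-max-single-start 1'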
nition DB (on Debian
that can be done with "update-smart-drivedb").
Intel's calculation of the media wearout always seems to be very fuzzy to
me, given your 7TB written I'd expect it to be 98%, at least 99%.
But then again a 200GB DC S3700 of mine has written 90TB out of 3650TB
t
o avoid putting too many journals on one SSD,
as a failure of the SSD will kill all associated HDD OSDs.
However as you have 21 hosts and hopefully decent redundancy and
distribution (CRUSH Map), going with 2 SSDs (6 journals per SSD) should be
fine.
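To put that in perspective with the numbers above: 2 SSDs x 6 journals = 12
OSDs per node, and losing one SSD takes 6 OSDs down at once, which across 21
hosts of that size is roughly 6 of 252 OSDs, i.e. about 2.4% of the cluster
that then has to be re-replicated.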
Christian
--
Christian Balzer        Network/S
when backfills and recovery settings are
lowered.
Regards,
Christian
--
Christian Balzer        Network/Systems Engineer
ch...@gol.com Global OnLine Japan/Rakuten Communications
http://www.gol.com/
nt_message_size_cap = 2147483648
> >> osd_deep_scrub_stride = 131072
> >> osd_op_threads = 8
> >> osd_disk_threads = 4
> >> osd_map_cache_size = 1024
> >> osd_map_cache_bl_size = 128
> >> osd_mount_options_xfs = "rw,noexec,nodev,n
> >>
> >> Thanks
> >> Swami
>
in 3500 OEM models) are very
much unsuited for journal use.
> But if I were you my choice would be between caching and moving them
> to a non-ceph use.
>
A readforward or readonly cache-tier with very strict promotion rules is
probably the best fit for
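As a sketch of what that can look like (pool names made up, and the hit_set
and recency values are only a starting point for strict promotion):

ceph osd tier add rbd cache-ssd
ceph osd tier cache-mode cache-ssd readforward
ceph osd tier set-overlay rbd cache-ssd
ceph osd pool set cache-ssd hit_set_type bloom
ceph osd pool set cache-ssd hit_set_count 4
ceph osd pool set cache-ssd hit_set_period 1200
ceph osd pool set cache-ssd min_read_recency_for_promote 2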
hat a single
PG/OSD can handle.
Christian
> Thanks & Regards
> Somnath
>
> -----Original Message-----
> From: Christian Balzer [mailto:ch...@gol.com]
> Sent: Wednesday, May 11, 2016 12:31 AM
> To: Somnath Roy
> Cc: Mark Nelson; Nick Fisk; ceph-users@lists.ceph.com
> S
ect block pool has 70% of space, each of the other
> > pools has 10% of storage space.
> >
> > Thanks.
e it's throttled
but simply because there are so few PGs/OSDs to choose from.
Or so it seems, purely from observation.
Christian
> On Wed, May 11, 2016 at 6:29 PM Christian Balzer wrote:
>
> > On Wed, 11 May 2016 16:10:06 + Somnath Roy wrote:
> >
> > > I bump
uld expect 24 backfills.
The prospective source OSDs aren't pegged with backfills either, they have
1-2 going on.
I'm seriously wondering if this behavior is related to what we're talking
about here.
Christian
> Thanks & Regards
> Somnath
>
> -----Original Message-----
n.
>
The most important statement/question last.
You will want to build a test cluster and verify that your application(s)
are actually working well with CephFS, because if you read the ML there
are cases when this may not be true.
Christian
--
Christian Balzer        Network/Systems Engineer
ighted
priorities and buckets (prioritize the bucket of the OSD with the most
target PGs).
Regards,
Christian
> Regards
> Somnath
>
> -----Original Message-----
> From: Christian Balzer [mailto:ch...@gol.com]
> Sent: Thursday, May 12, 2016 11:52 PM
> To: Somnath Roy
> Cc: Sc
ght be missing here? Are there any other issues
> that we might need to be aware of? I seem to recall some discussion on
> the list with regard to settings that were required to make caching work
> correctly, but my memory seems to indicate that these changes were
> already added
g.
> > Filesystem issue or kernel, but also as you add nodes the data
> > movement will introduce a good deal of overhead.
> >
> > Regards,
> > Alex
> >
> >
> >>
> >> Cheers,
> >> Mike
> >>
> >>
>
>
> > >
> > > >
> > > > I confirmed the settings are indeed correctly picked up across the
> > nodes in
> > > > the cluster.
> > >
> > > Good, glad we got that sorted
> > >
> >
is indeed one of the reasons. The other reason was that I thought
> that by removing dirty objects I didn't need replication on the cache
> tier, which I'm now starting to doubt again...
You absolutely want your cache tier to have sufficient replication.
2 at the very lea
;
>
> pg_num is the actual amount of PGs. This you can increase without any
> actual data moving.
>
Yes and no.
Increasing the pg_num will split PGs, which causes potentially massive I/O.
Also AFAIK that I/O isn't regulated by the various recovery and backfill
parameters.
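If you do have to grow a pool's PGs, the usual way to keep that I/O somewhat
in check is to raise pg_num in small steps and only then bump pgp_num (pool
name and step size made up):

ceph osd pool set rbd pg_num 2048    # step up in modest increments
# wait for the new PGs to be created and the cluster to settle, then:
ceph osd pool set rbd pgp_num 2048   # this is what triggers the actual data movement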
Hello,
On Tue, 17 May 2016 10:47:15 +1000 Chris Dunlop wrote:
> On Tue, May 17, 2016 at 08:21:48AM +0900, Christian Balzer wrote:
> > On Mon, 16 May 2016 22:40:47 +0200 (CEST) Wido den Hollander wrote:
> > >
> > > pg_num is the actual amount of PGs. This y
Hello,
On Tue, 17 May 2016 12:12:02 +1000 Chris Dunlop wrote:
> Hi Christian,
>
> On Tue, May 17, 2016 at 10:41:52AM +0900, Christian Balzer wrote:
> > On Tue, 17 May 2016 10:47:15 +1000 Chris Dunlop wrote:
> > Most of your questions would be easily answered if you did sp
> http://docs.ceph.com/docs/master/_downloads/v0.94.6.txt
> >> >
> >> > Getting Ceph
> >> >
> >> >
> >> > * Git at git://github.com/ceph/ceph.git
> >> > * Tarball at http://download.ceph.com/tarballs/ceph-0.94.7.tar.gz
>
out of sockets
# Also increase the max packet backlog
net.core.somaxconn = 1024
net.core.netdev_max_backlog = 5
net.ipv4.tcp_max_syn_backlog = 3
net.ipv4.tcp_max_tw_buckets = 200
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 10
# Disable TCP slow start on idle connections
net.ipv4.
is a requirement in your use case.
Christian
> Thanks for any comment
> Dietmar
>
> [1] http://docs.ceph.com/docs/jewel/start/hardware-recommendations/
> [2]
> https://www.redhat.com/en/files/resources/en-rhst-cephstorage-supermicro-INC0270868_v2_0715.pdf
>
--
Ch
nough RAM to keep all
your important bits in memory can be a game changer.
Christian
--
Christian Balzer        Network/Systems Engineer
ch...@gol.com Global OnLine Japan/Rakuten Communications
http://www.gol.com/
e it faster by itself AND enable it to
use more CPU resources as well.
The NVMes (DC P3700 one presumes?) just for cache tiering, no SSD
journals for the OSDs?
What are your network plans then, as in is your node storage bandwidth a
good match for your network bandwidth?
> > That is
Note that if your total NVMe write bandwidth is more than the total
> > disk bandwidth they act as buffers capable of handling short write
> > bursts (only if there's no read on recent writes which should almost
> > never happen for RBD but might for other uses) so yo
> PG count to the next power of 2). We also set vfs_cache_pressure to 1,
> > though this didn't really seem to do much at the time. I've also seen
> > recommendations about setting min_free_kbytes to something higher
> > (currently 90112 on our hardware) but have not verified this.
> >
>
likely make
> the normal osd startup crush location update do so with the OSDs
> advertised capacity). Is it sensible?
>
> And/or, anybody have a good idea how the tools can/should be changed to
> make the osd replacement re-use the osd id?
>
> sage
>
>
>
Hello Kris,
On Wed, 18 May 2016 19:31:49 -0700 Kris Jurka wrote:
>
>
> On 5/18/2016 7:15 PM, Christian Balzer wrote:
>
> >> We have hit the following issues:
> >>
> >> - Filestore merge splits occur at ~40 MObjects with default
> >> setti
Hello,
On Thu, 19 May 2016 10:51:20 +0200 Dietmar Rieder wrote:
> Hello,
>
> On 05/19/2016 03:36 AM, Christian Balzer wrote:
> >
> > Hello again,
> >
> > On Wed, 18 May 2016 15:32:50 +0200 Dietmar Rieder wrote:
> >
> >> Hello Christian,
>
--
Christian Balzer        Network/Sys
--
Christi
hile it's moving, there is a true performance hit on the virtual servers.
>
> So if this could be solved, by a IOPS/HDD Bandwidth rate limit, that i
> can simply tell the cluster to use max. 10 IOPS and/or 10 MB/s for the
> recovery, then i
ng be
very small if you choose the right type of SSD, Intel DC 37xx or at least
36xx for example.
Christian
> - epk
>
> -----Original Message-----
> From: Christian Balzer [mailto:ch...@gol.com]
> Sent: Thursday, May 19, 2016 7:00 PM
> To: ceph-users@lists.ceph.com
> Cc: EP Komar
there a way within ceph to tell a particular OSS to ignore OSDs that
> aren't meant for it? It's odd to me that a mere partprobe causes the OSD
> to mount even.
>
>
> Brian Andrus
> ITACS/Research Computing
> Naval Postgraduate School
> Monterey, California
> voice: 8
960.html
That said, this is very poorly documented, like other CephFS bits as well
when it comes to manual deployment.
Christian
--
Christian Balzer        Network/Systems Engineer
ch...@gol.com Global OnLine Japan/Rakuten
d/or with librbd and fuse and then run fstrim.
However trim is a pretty costly activity in Ceph, so it may
a) impact your cluster performance and
b) take a while, depending on how much data we're talking about.
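A minimal sketch with the kernel client (image, pool and mount point made up):

rbd map rbd/myimage           # shows up as e.g. /dev/rbd0
mount /dev/rbd0 /mnt/img
fstrim -v /mnt/img            # returns the discarded space to the pool
umount /mnt/img
rbd unmap /dev/rbd0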
Lastly, while having a sparse storage service like Ceph is very nice, I
always try to ha
uch better
fit for you.
Regards,
Christian
> regards,
> Ben
>
> On Wed, May 18, 2016 at 10:01 PM, Christian Balzer wrote:
> >
> > Hello,
> >
> > On Wed, 18 May 2016 12:32:25 -0400 Benjeman Meekhof wrote:
> >
> >> Hi Lionel,
> >>
> >
cs/master/rados/troubleshooting/troubleshooting-osd/
> > to hopefully find out what's happening but as far as the hardware is
> > concerned everything looks fine. No smart errors logged, iostats shows
> > some activity but nothing pegged to 100%, no messages in dmesg and t
rformance perspective, has anybody observed a
> > >> significant performance hit by untagging vlans on the node? This is
> > >> something I
> > can't
> > >> test, since I don't currently own 40 gig gear.
> > >>
> > >> 3.a)
--
Christian Balzer        Network/Systems Engineer
ch...@gol.com
nodes — pinning OSD processes, HBA/NIC interrupts etc.
> to cores/sockets to limit data sent over QPI links on NUMA
> architectures. It’s easy to believe that modern inter-die links are
> Fast Enough For You Old Man but there's more to it.
>
Ayup, very much so.
Christian
--
Chr
_ruleset {
> ruleset 0
> type replicated
> min_size 1
> max_size 10
> step take default
> step chooseleaf firstn 0 type host
> step emit
> }
> # end crush map
>
> _
sing the pool in read-forward now so there should be
> > almost no promotion from EC to the SSD pool. I will see what options I
> > have for adding some SSD journals to the OSD nodes to help speed
> > things along.
> >
> > Thanks, and apologies again for missing your e
's just beyond odd.
As for Heath, we do indeed need more data as in:
a) How busy are your HDD nodes? (atop, iostat). Any particular HDDs/OSDs
standing out, as in being slower/busier for a prolonged time?
b) No SSD journals for the spinners right?
c) The memory exhaustion is purely caused by the
onfig for fio.
>
> I am confused because EMC ScaleIO can do many more IOPS, which is bothering
> my boss :)
>
There are lot of discussion and slides on how to improve/maximize IOPS
with Ceph, go search for them.
Fast CPUs, jemalloc, pinning, configuration, NVMes for journals, etc.
Chris
lean
> 262 pgs stuck undersized
> 408 pgs undersized
> recovery 315/1098 objects degraded (28.689%)
> recovery 234/1098 objects misplaced (21.311%)
> 1 mons down, quorum 0,2 ceph1-node,ceph-mon2
> monmap e1: 3 mons at
>
--
Christia