> Honestly, we are scared to try the same tests with the m600s. When we
> first put them in, we had them more full, but we backed them off to
> reduce the load on them.
I see.
Did you tune anything at the Linux layer, like:
vm.vfs_cache_pressure
It may not be necessary to mention it specifically, since
Thank you for your response!
All my hosts have RAID cards. Some RAID cards are in pass-through mode,
and the others are in write-back mode. I will set all RAID cards to
pass-through mode and observe for a period of time.
Best Regards
sunspot
2016-02-25 20:07 GMT+08:00 Ferhat Ozkasgarli :
I've done a bit of testing with the Intel units: S3600, S3700, S3710, and
P3700. I've also tested the Samsung 850 Pro, 845DC Pro, and SM863.
All of my testing was "worst case IOPS" as described here:
http://www.anandtech.com/show/8319/samsung-ssd-845dc-evopro-preview-exploring-worstcase-iops/6
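(For anyone wanting to reproduce that methodology: it boils down to
preconditioning the drive and then measuring sustained 4K random writes. A
rough sketch, not the exact AnandTech procedure, with /dev/sdX as a
placeholder; both runs are destructive:
  fio --name=precondition --filename=/dev/sdX --rw=write --bs=1M --iodepth=32 --ioengine=libaio --direct=1 --loops=2
  fio --name=steady-state --filename=/dev/sdX --rw=randwrite --bs=4k --iodepth=32 --ioengine=libaio --direct=1 --time_based --runtime=1800
The second job is left running long enough for the drive to reach steady
state, which is where the "worst case" numbers come from.)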
Honestly, we are scared to try the same tests with the m600s. When we
first put them in, we had them more full, but we backed them off to
reduce the load on them. Based on that I don't expect them to fare any
better. We'd love to get more IOPs out of
Thank you for your very precious output.
"s3610s write iops high-load" is very interesting to me.
Have you ever run the same test set from the s3610s against the m600s?
> These clusters normally service 12K IOPs with bursts up to 22K IOPs all RBD.
> I've seen a peak of 64K IOPs from client traffic.
That's pr
You need to make sure SSD O_DIRECT|O_DSYNC performance is good. Not all
SSDs are good at it. Refer to the prior discussions in the community for that.
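(The test that usually comes up in those discussions is a single-job 4k sync
write run against the raw device; a sketch, destructive, with /dev/sdX as a
placeholder:
  fio --name=journal-test --filename=/dev/sdX --rw=write --bs=4k --numjobs=1 --iodepth=1 --direct=1 --sync=1 --runtime=60 --time_based
In community reports, good journal SSDs sustain tens of thousands of IOPS
here, while unsuitable ones drop to a few hundred.)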
<< Presumably as long as the SSD read speed exceeds that of the spinners, that
is sufficient.
You probably meant write speed of SSDs ? Journal w
Ignoring the durability and network issues for now :) Are there any
aspects of a journal's performance that matter most for overall ceph
performance?
i.e. my initial thought is that if I want to improve ceph write performance,
journal seq write speed is what matters. Does random write speed factor
at all?
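(e.g. one could compare the two directly on a candidate journal device; a
sketch, destructive, with /dev/sdX standing in for the journal SSD:
  fio --name=seq --filename=/dev/sdX --rw=write --bs=4k --direct=1 --sync=1 --runtime=60 --time_based
  fio --name=rand --filename=/dev/sdX --rw=randwrite --bs=4k --direct=1 --sync=1 --runtime=60 --time_based
The gap between the two runs shows how much random write speed would matter.)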
Resending sans attachment...
A picture is worth a thousand words:
http://robert.leblancnet.us/files/s3610-load-test-20160224.png
The red lines are the m600s IO time (dotted) and IOPs (solid), with our
baseline s3610s in green and our test set of s3610s
Thanks for your input.
Things are getting clearer. It may be necessary to ask you more questions, though ;-)
Rgds,
Shinobu
- Original Message -
From: "Josh Durgin"
To: "Shinobu Kinjo"
Cc: "Jan Schermer" , ceph-users@lists.ceph.com
Sent: Saturday, February 27, 2016 8:39:39 AM
Subject: Re: [ceph-users] Obser
On 02/26/2016 03:17 PM, Shinobu Kinjo wrote:
In jewel, as you mentioned, there will be "--max-objects" and "--object-size"
options.
That hint will go away or be mitigated with those options. Correct?
The io hint isn't sent by rados bench, just rbd. So even with those
options, rados bench still doesn't send it.
On 02/26/2016 02:27 PM, Shinobu Kinjo wrote:
In this case it's likely rados bench using tiny objects that's
causing the massive overhead. rados bench is doing each write to a new
object, which ends up in a new file beneath the osd, with its own
xattrs too. For 4k writes, that's a ton of overhead.
Thanks!
In jewel, as you mentioned, there will be "--max-objects" and "--object-size"
options.
That hint will go away or be mitigated with those options. Correct?
Are those options available in:
# ceph -v
ceph version 10.0.2 (86764eaebe1eda943c59d7d784b893ec8b0c6ff9)??
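If they are, usage would presumably look something like this (a sketch based
only on the option names quoted above; I have not verified the exact syntax
against 10.0.2):
  rados -p rbd bench 60 write -b 4096 --object-size 4194304 --max-objects 100
That should make the 4k writes land in a bounded set of larger objects
instead of one new object (and one new file on the OSD) per write.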
Rgds,
Shinobu
- Original
Hi Cephers
At the moment we are trying to recover our Ceph cluster (0.87), which is
behaving very oddly.
What have been done :
1. An OSD drive failure happened - Ceph marked the OSD down and out.
2. The physical HDD was replaced and NOT added to Ceph - here we had a strange
kernel crash just after the HDD was connected.
Hello,
> We started having high wait times on the M600s so we got 6 S3610s, 6 M500dcs,
> and 6 500 GB M600s (they have the SLC to MLC conversion that we thought might
> work better).
Is it working better, as you were expecting?
> We have graphite gathering stats on the admin sockets for Ceph a
On 02/26/2016 01:42 PM, Jan Schermer wrote:
RBD backend might be even worse, depending on how large a dataset you try. One 4KB
block can end up creating a 4MB object, and depending on how well hole-punching
and fallocate work on your system you could in theory end up with a >1000x
amplification
On Sat, Feb 27, 2016 at 12:08 AM, Andy Allan wrote:
> When I made a (trivial, to be fair) documentation PR it was dealt with
> immediately, both when I opened it, and when I fixed up my commit
> message. I'd recommend that if anyone sees anything wrong with the
> docs, just submit a PR with the fix
On Fri, Feb 26, 2016 at 11:28 PM, John Spray wrote:
> Some projects have big angry warning banners at the top of their
> master branch documentation, I think perhaps we should do that too,
> and at the same time try to find a way to steer google hits to the
> latest stable branch docs rather than
> In this case it's likely rados bench using tiny objects that's
> causing the massive overhead. rados bench is doing each write to a new
> object, which ends up in a new file beneath the osd, with its own
> xattrs too. For 4k writes, that's a ton of overhead.
That means that we don't see any prop
On Fri, Feb 26, 2016 at 6:08 AM, Andy Allan wrote:
> including a nice big obvious version switcher banner on every
> page.
We used to have something like this, but we didn't set it back up when
we migrated the web servers to new infrastructure a while back. It was
using https://github.com/alfredo
RBD backend might be even worse, depending on how large a dataset you try. One
4KB block can end up creating a 4MB object, and depending on how well
hole-punching and fallocate work on your system you could in theory end up
with a >1000x amplification if you always hit a different 4MB chunk (but t
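(To make the >1000 figure explicit: 4 MB / 4 KB = 1024, i.e. up to ~1024x
amplification when every 4 KB write lands in a previously untouched 4 MB chunk.)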
On 02/24/2016 07:10 PM, Christian Balzer wrote:
10 second rados bench with 4KB blocks, 219MB written in total.
NAND writes per SSD: 41 * 32MB = 1312MB.
Across all 8 SSDs that is 1312MB * 8 = 10496MB written.
Amplification: 10496 / 219 ≈ 48!!!
Le ouch.
In my use case with rbd cache on all VMs I expect writes to be rather
large for the most part.
Hello Nick,
On Fri, 26 Feb 2016 09:46:03 - Nick Fisk wrote:
> Hi Christian,
>
> > -Original Message-
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> > Of Christian Balzer
> > Sent: 26 February 2016 09:07
> > To: ceph-users@lists.ceph.com
> > Subject: [cep
This Infernalis point release fixes several packaging and init script
issues, enables the librbd objectmap feature by default, fixes a few librbd
bugs, and includes a range of miscellaneous bug fixes across the system.
We recommend that all infernalis v9.2.0 users upgrade.
For more detailed information, see the full release notes.
Hi,
Maybe this is the reason of another bug?
http://tracker.ceph.com/issues/13764
The situation is very similar...
--
Regards
Dominik
2016-02-25 16:17 GMT+01:00 Ritter Sławomir :
> Hi,
>
>
>
> We have two CEPH clusters running on Dumpling 0.67.11 and some of our
> "multipart objects" are incomplete.
> My guess would be that if you are already running hammer on the client it is
> already using the new watcher API. This would be a fix on the OSDs to allow
> the object to be moved because the current client is smart enough to try
> again. It would be watchers per object.
> Sent from a mobile device
On 26 February 2016 at 05:53, Christian Balzer wrote:
> I have a feeling some dedicated editors including knowledgeable and vetted
> volunteers would do a better job than just spamming PRs, which tend to be
> forgotten/ignored by the already overworked devs.
When I made a (trivial, to be fair)
On Fri, Feb 26, 2016 at 5:24 AM, Nigel Williams
wrote:
> On Fri, Feb 26, 2016 at 4:09 PM, Adam Tygart wrote:
>> The docs are already split by version, although it doesn't help that
>> it isn't linked in an obvious manner.
>>
>> http://docs.ceph.com/docs/master/rados/operations/cache-tiering/
>
>
On Fri, Feb 26, 2016 at 5:53 AM, Christian Balzer wrote:
>
> Hello,
>
> On Thu, 25 Feb 2016 23:09:52 -0600 Adam Tygart wrote:
>
>> The docs are already split by version, although it doesn't help that
>> it isn't linked in an obvious manner.
>>
>> http://docs.ceph.com/docs/master/rados/operations/c
Alexander,
> # ceph osd pool get-quota cache
> quotas for pool 'cache':
> max objects: N/A
> max bytes : N/A
> But I set target_max_bytes:
> # ceph osd pool set cache target_max_bytes 1
> Could that be the reason?
I've been unable to reproduce http://tracker.ceph.com/issues/13098
w
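(As an aside: pool quotas and cache-tier sizing are separate settings, which
is why get-quota shows N/A even with target_max_bytes set. To read the cache
sizing back, something like:
  # ceph osd pool get cache target_max_bytes
should do it.)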
I can reproduce and updated the ticket. (I only upgraded the client,
not the server).
It seems to be related to the new --no-verify option, which is giving
strange results -- see the ticket.
-- Dan
On Fri, Feb 26, 2016 at 11:48 AM, Alexey Sheplyakov
wrote:
> Christian,
>
>> Note that "rand" works fine, as does "seq" on a 0.94.5 cluster.
Christian,
> Note that "rand" works fine, as does "seq" on a 0.94.5 cluster.
Could you please check if 0.94.5 ("old") *client* works with 0.94.6
("new") servers, and vice a versa?
Best regards,
Alexey
On Fri, Feb 26, 2016 at 9:44 AM, Christian Balzer wrote:
>
> Hello,
>
> On my crappy test cluster
Thanks Jan, that is an excellent explanation.
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Jan
Schermer
Sent: 26 February 2016 10:07
To: Huan Zhang
Cc: josh durgin ; Nick Fisk ;
ceph-users
Subject: Re: [ceph-users] Guest sync write iops so poor.
O_DIRECT is _not_ a flag for synchronous blocking IO.
Also take a look at Galera cluster. You can relax flushing to disk as long as
all your nodes don't go down at the same time.
(And when a node goes back up after a crash you should trash it before it
rejoins the cluster)
Jan
> On 26 Feb 2016, at 11:01, Nick Fisk wrote:
>
> I guess my question was more around what your final workload looks like
O_DIRECT is _not_ a flag for synchronous blocking IO.
O_DIRECT only hints to the kernel that it need not cache/buffer the data.
The kernel is actually free to buffer and cache it, and it does buffer it.
It also does _not_ flush O_DIRECT writes to disk, but it makes a best effort
to send them to the drives
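(A quick way to observe the difference from userspace, using GNU dd;
/tmp/ddtest is a scratch file that gets overwritten:
  dd if=/dev/zero of=/tmp/ddtest bs=4k count=1000 oflag=direct
  dd if=/dev/zero of=/tmp/ddtest bs=4k count=1000 oflag=direct,dsync
The first run only bypasses the page cache; the second additionally waits for
each write to reach stable storage, which is why it is the run that exposes
slow sync-write performance.)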
I guess my question was more around what your final workload looks like. If
it's the same as the SQL benchmarks then you are not going to get much better
performance than you do now, aside from trying some of the tuning options
I mentioned, which might get you an extra 100 IOPS.
The only
fio /dev/rbd0 sync=1 has no problem.
I can't find any 'sync cache' code in the linux rbd block driver or the radosgw API.
It seems sync cache is just a concept of librbd (for the rbd cache).
Just my concerns.
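(For context, the krbd run mentioned above would presumably be something like
this; the exact job parameters are my guess:
  fio --name=krbd-sync --filename=/dev/rbd0 --rw=write --bs=4k --numjobs=1 --iodepth=1 --sync=1 --direct=1 --runtime=60 --time_based)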
2016-02-26 17:30 GMT+08:00 Huan Zhang :
> Hi Nick,
> DB's IO pattern depends on config, mysql for exampl
Hi Christian,
> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Christian Balzer
> Sent: 26 February 2016 09:07
> To: ceph-users@lists.ceph.com
> Subject: [ceph-users] Cache tier weirdness
>
>
> Hello,
>
> still my test cluster with 0.94.6
Hi Nick,
DB IO patterns depend on config; take mysql for example. With
innodb_flush_log_at_trx_commit = 1, mysql will sync after each transaction,
like:
write
sync
write
sync
...
innodb_flush_log_at_trx_commit = 5,
write
write
write
write
write
sync
innodb_flush_log_at_trx_commit = 0,
write
write
...
one sync per second
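(For reference, these settings live in my.cnf; a minimal sketch of the
strictest case. Per the MySQL docs: 1 = flush the log at every commit,
2 = write at commit but flush about once per second, 0 = write and flush
about once per second.
  [mysqld]
  innodb_flush_log_at_trx_commit = 1
The looser values trade durability of the last second or so of transactions
for far fewer sync writes hitting the storage.)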
Hello,
still my test cluster with 0.94.6.
It's a bit fuzzy, but I don't think I saw this with Firefly, but then
again that is totally broken when it comes to cache tiers (switching
between writeback and forward mode).
goat is a cache pool for rbd:
---
# ceph osd pool ls detail
pool 2 'rbd' repl
> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Huan Zhang
> Sent: 26 February 2016 06:50
> To: Jason Dillaman
> Cc: josh durgin ; Nick Fisk ;
> ceph-users
> Subject: Re: [ceph-users] Guest sync write iops so poor.
>
> rbd engine with fsync