> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Huan Zhang
> Sent: 26 February 2016 06:50
> To: Jason Dillaman
> Cc: josh durgin ; Nick Fisk ;
> ceph-users
> Subject: Re: [ceph-users] Guest sync write iops so poor.
>
> rbd engine with fsy
Hello,
still my test cluster with 0.94.6.
It's a bit fuzzy, but I don't think I saw this with Firefly, but then
again that is totally broken when it comes to cache tiers (switching
between writeback and forward mode).
goat is a cache pool for rbd:
---
# ceph osd pool ls detail
pool 2 'rbd' repl
Hi Nick,
DB's IO pattern depends on config; mysql for example.
With innodb_flush_log_at_trx_commit = 1, mysql will sync after each transaction,
like:
write
sync
write
sync
...
innodb_flush_log_at_trx_commit = 2,
write
write
write
write
write
sync
innodb_flush_log_at_trx_commit = 0,
write
write
...
one sync per second.
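If you want to reproduce those two patterns outside mysql, fio's fsync= option
issues an fsync after every N writes. A rough sketch (filename, size, and block
size are just placeholders):

# sync after every write, like innodb_flush_log_at_trx_commit = 1
fio --name=trx1 --filename=/mnt/test/trx --size=256M --rw=write --bs=16k --fsync=1
# sync after every 5th write, like the batched pattern above
fio --name=trx5 --filename=/mnt/test/trx --size=256M --rw=write --bs=16k --fsync=5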
Hi Christian,
> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Christian Balzer
> Sent: 26 February 2016 09:07
> To: ceph-users@lists.ceph.com
> Subject: [ceph-users] Cache tier weirdness
>
>
> Hello,
>
> still my test cluster with 0.94.6
fio on /dev/rbd0 with sync=1 has no problem.
I can't find any 'sync cache' code in the linux rbd block driver or the radosgw API.
It seems sync cache is a concept only in librbd (for the rbd cache).
Just my concern.
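For reference, a krbd sync-write test looks something like this (parameters are
illustrative; sync=1 makes fio open the device with O_SYNC, and it writes to
the device destructively):

fio --name=krbd-sync --filename=/dev/rbd0 --rw=write --bs=4k \
    --direct=1 --sync=1 --iodepth=1 --numjobs=1 --runtime=60 --time_based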
2016-02-26 17:30 GMT+08:00 Huan Zhang :
> Hi Nick,
> DB's IO pattern depends on config, mysql for exampl
I guess my question was more around what your final workload looks like. If
it's the same as the SQL benchmarks, then you are not going to get much better
performance than you do now, aside from trying some of the tuning options I
mentioned, which might get you an extra 100 IOPS.
The only
O_DIRECT is _not_ a flag for synchronous blocking IO.
O_DIRECT only hints to the kernel that it need not cache/buffer the data.
The kernel is actually free to buffer and cache it, and it does buffer it.
It also does _not_ flush O_DIRECT writes to disk, but it makes a best effort to
send them to the drives
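A quick way to see the difference from the command line (paths are placeholders):

# O_DIRECT alone: bypasses the page cache, but gives no durability guarantee
dd if=/dev/zero of=/mnt/test/ddtest bs=4k count=1000 oflag=direct
# O_DIRECT plus O_DSYNC: each write must reach stable storage before returning
dd if=/dev/zero of=/mnt/test/ddtest bs=4k count=1000 oflag=direct,dsync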
Also take a look at Galera cluster. You can relax flushing to disk as long as
all your nodes don't go down at the same time.
(And when a node comes back up after a crash, you should trash it before it
rejoins the cluster.)
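A minimal my.cnf sketch of what "relaxed flushing" means here (paths and node
addresses are placeholders, not a recommendation):

[mysqld]
# redo log is written at commit but flushed only ~once per second
innodb_flush_log_at_trx_commit = 2
sync_binlog = 0
# Galera replication is what makes the relaxed flushing tolerable
wsrep_provider = /usr/lib/galera/libgalera_smm.so
wsrep_cluster_address = gcomm://node1,node2,node3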
Jan
> On 26 Feb 2016, at 11:01, Nick Fisk wrote:
>
> I guess my question
Thanks Jan, that is an excellent explanation.
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Jan
Schermer
Sent: 26 February 2016 10:07
To: Huan Zhang
Cc: josh durgin ; Nick Fisk ;
ceph-users
Subject: Re: [ceph-users] Guest sync write iops so poor.
O_DIRECT is _no
Christian,
> Note that "rand" works fine, as does "seq" on a 0.94.5 cluster.
Could you please check if 0.94.5 ("old") *client* works with 0.94.6
("new") servers, and vice a versa?
Best regards,
Alexey
On Fri, Feb 26, 2016 at 9:44 AM, Christian Balzer wrote:
>
> Hello,
>
> On my crappy te
I can reproduce it and have updated the ticket. (I only upgraded the
client, not the server.)
It seems to be related to the new --no-verify option, which is giving
strange results -- see the ticket.
-- Dan
On Fri, Feb 26, 2016 at 11:48 AM, Alexey Sheplyakov
wrote:
> Christian,
>
>> Note that "rand" wo
Alexander,
> # ceph osd pool get-quota cache
> quotas for pool 'cache':
> max objects: N/A
> max bytes : N/A
> But I set target_max_bytes:
> # ceph osd pool set cache target_max_bytes 1
> Could that be the reason?
I've been unable to reproduce http://tracker.ceph.com/issues/13098
w
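Note that pool quotas and the cache tiering target are separate settings; to
see both (assuming your cache pool is named 'cache', and if your ceph version
supports 'pool get' for the tiering values):

# hard pool quota (what get-quota shows; N/A above)
ceph osd pool get-quota cache
# flush/evict target actually used by the tiering agent
ceph osd pool get cache target_max_bytes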
On Fri, Feb 26, 2016 at 5:53 AM, Christian Balzer wrote:
>
> Hello,
>
> On Thu, 25 Feb 2016 23:09:52 -0600 Adam Tygart wrote:
>
>> The docs are already split by version, although it doesn't help that
>> it isn't linked in an obvious manner.
>>
>> http://docs.ceph.com/docs/master/rados/operations/c
On Fri, Feb 26, 2016 at 5:24 AM, Nigel Williams
wrote:
> On Fri, Feb 26, 2016 at 4:09 PM, Adam Tygart wrote:
>> The docs are already split by version, although it doesn't help that
>> it isn't linked in an obvious manner.
>>
>> http://docs.ceph.com/docs/master/rados/operations/cache-tiering/
>
>
On 26 February 2016 at 05:53, Christian Balzer wrote:
> I have a feeling some dedicated editors including knowledgeable and vetted
> volunteers would do a better job than just spamming PRs, which tend to be
> forgotten/ignored by the already overworked devs.
When I made a (trivial, to be fair)
> My guess would be that if you are already running hammer on the client it is
> already using the new watcher API. This would be a fix on the OSDs to allow
> the object to be moved because the current client is smart enough to try
> again. It would be watchers per object.
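(To check this yourself, rados can list an object's watchers; the header
object name below is illustrative and depends on the image format:)

rados -p rbd listwatchers rbd_header.<image_id>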
Hi,
Maybe this is the reason of another bug?
http://tracker.ceph.com/issues/13764
The situation is very similar...
--
Regards
Dominik
2016-02-25 16:17 GMT+01:00 Ritter Sławomir :
> Hi,
>
>
>
> We have two CEPH clusters running on Dumpling 0.67.11 and some of our
> "multipart objects" are incompl
This Infernalis point release fixes several packaging and init script
issues, enables the librbd objectmap feature by default, fixes a few librbd
bugs, and includes a range of miscellaneous bug fixes across the system.
We recommend that all infernalis v9.2.0 users upgrade.
For more detailed information, see t
Hello Nick,
On Fri, 26 Feb 2016 09:46:03 - Nick Fisk wrote:
> Hi Christian,
>
> > -Original Message-
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> > Of Christian Balzer
> > Sent: 26 February 2016 09:07
> > To: ceph-users@lists.ceph.com
> > Subject: [cep
On 02/24/2016 07:10 PM, Christian Balzer wrote:
10 second rados bench with 4KB blocks, 219MB written in total.
NAND writes per SSD: 41 * 32MB = 1312MB.
10496MB total written across all 8 SSDs.
Amplification: 48!!!
Le ouch.
In my use case with rbd cache on all VMs I expect writes to be rather
large for the m
RBD backend might be even worse, depending on how large a dataset you try. One
4KB block can end up creating a 4MB object, and depending on how well
hole-punching and fallocate work on your system you could in theory end up
with a >1000 amplification if you always hit a different 4MB chunk (but t
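You can confirm the 4MB chunk size for a given image with rbd info (the image
name is a placeholder); the object size is 2^order bytes:

# "order 22" in the output means 4MB objects
rbd info <image-name>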
On Fri, Feb 26, 2016 at 6:08 AM, Andy Allan wrote:
> including a nice big obvious version switcher banner on every
> page.
We used to have something like this, but we didn't set it back up when
we migrated the web servers to new infrastructure a while back. It was
using https://github.com/alfredo
> In this case it's likely rados bench using tiny objects that's
> causing the massive overhead. rados bench is doing each write to a new
> object, which ends up in a new file beneath the osd, with its own
> xattrs too. For 4k writes, that's a ton of overhead.
That means that we don't see any prop
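For context, the kind of run being discussed is something like this (hammer-era
rados bench flags; -b is the write size, and every write lands in a new object):

rados bench -p rbd 10 write -b 4096 -t 16 --no-cleanup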
On Fri, Feb 26, 2016 at 11:28 PM, John Spray wrote:
> Some projects have big angry warning banners at the top of their
> master branch documentation, I think perhaps we should do that too,
> and at the same time try to find a way to steer google hits to the
> latest stable branch docs rather than
On Sat, Feb 27, 2016 at 12:08 AM, Andy Allan wrote:
> When I made a (trivial, to be fair) documentation PR it was dealt with
> immediately, both when I opened it, and when I fixed up my commit
> message. I'd recommend that if anyone sees anything wrong with the
> docs, just submit a PR with the fi
On 02/26/2016 01:42 PM, Jan Schermer wrote:
RBD backend might be even worse, depending on how large a dataset you try. One 4KB
block can end up creating a 4MB object, and depending on how well hole-punching
and fallocate work on your system you could in theory end up with a >1000
amplification
Hi Cephers
At the moment we are trying to recover our CEPH cluster (0.87), which is
behaving very oddly.
What has been done:
1. OSD drive failure happened - CEPH put the OSD down and out.
2. Physical HDD replaced and NOT added to CEPH - here we had a strange
kernel crash just after the HDD was connected to th
Hello,
> We started having high wait times on the M600s so we got 6 S3610s, 6 M500dcs,
> and 6 500 GB M600s (they have the SLC to MLC conversion that we thought might
> work better).
Is it working better, as you expected?
> We have graphite gathering stats on the admin sockets for Ceph a
Thanks!
In jewel, as you mentioned, there will be "--max-objects" and "--object-size"
options.
That hint will go away or be mitigated w/ those options. Correct?
Are those options available in:
# ceph -v
ceph version 10.0.2 (86764eaebe1eda943c59d7d784b893ec8b0c6ff9)??
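If they are, a bounded run would look roughly like this (flag spellings assumed
from this thread, not verified against 10.0.2):

# reuse a bounded set of larger objects instead of one new object per write
rados bench -p rbd 60 write -b 4096 --object-size 4194304 --max-objects 128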
Rgds,
Shinobu
- Original
On 02/26/2016 02:27 PM, Shinobu Kinjo wrote:
In this case it's likely rados bench using tiny objects that's
causing the massive overhead. rados bench is doing each write to a new
object, which ends up in a new file beneath the osd, with its own
xattrs too. For 4k writes, that's a ton of overhead.
On 02/26/2016 03:17 PM, Shinobu Kinjo wrote:
In jewel, as you mentioned, there will be "--max-objects" and "--object-size"
options.
That hint will go away or be mitigated w/ those options. Correct?
The io hint isn't sent by rados bench, just rbd. So even with those
options, rados bench still doesn
Thanks for your input.
Things are getting clearer to me. I may need to ask you more, though -;
Rgds,
Shinobu
- Original Message -
From: "Josh Durgin"
To: "Shinobu Kinjo"
Cc: "Jan Schermer" , ceph-users@lists.ceph.com
Sent: Saturday, February 27, 2016 8:39:39 AM
Subject: Re: [ceph-users] Obser
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256
Resending sans attachment...
A picture is worth a thousand words:
http://robert.leblancnet.us/files/s3610-load-test-20160224.png
The red lines are the m600s IO time (dotted) and IOPs (solid) and our
baseline s3610s in green and our test set of s36
Ignoring the durability and network issues for now :) Are there any
aspects of a journal's performance that matter most for overall ceph
performance?
i.e. my initial thought is that if I want to improve ceph write performance,
journal seq write speed is what matters. Does random write speed factor
at
You need to make sure SSD O_DIRECT|O_DSYNC performance is good. Not all
SSDs are good at it. Refer to the prior discussions in the community for that.
<< Presumably as long as the SSD read speed exceeds that of the spinners, that
is sufficient.
You probably meant write speed of SSDs? Journal w
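The usual way to check that is a single-threaded sync write fio run directly
against the SSD (destructive; /dev/sdX is a placeholder; fio's sync=1 uses
O_SYNC, which is at least as strict as the O_DSYNC the journal uses):

fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k \
    --iodepth=1 --numjobs=1 --runtime=60 --time_based --name=journal-test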
Thank you for your very valuable output.
The "s3610s write iops high-load" is very interesting to me.
Have you ever run the same test set on the m600s as on the s3610s?
> These clusters normally service 12K IOPs with bursts up to 22K IOPs all RBD.
> I've seen a peak of 64K IOPs from client traffic.
That's pr
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256
Honestly, we are scared to try the same tests with the m600s. When we
first put them in, we had them more full, but we backed them off to
reduce the load on them. Based on that I don't expect them to fare any
better. We'd love to get more IOPs out of
I've done a bit of testing with the Intel units: S3600, S3700, S3710, and
P3700. I've also tested the Samsung 850 Pro, 845DC Pro, and SM863.
All of my testing was "worst case IOPS" as described here:
http://www.anandtech.com/show/8319/samsung-ssd-845dc-evopro-preview-exploring-worstcase-iops/6
Thank you for your response!
All my hosts have raid cards. Some raid cards are in pass-through mode,
and the others are in write-back mode. I will set all raid cards to
pass-through mode and observe for a period of time.
Best Regards
sunspot
2016-02-25 20:07 GMT+08:00 Ferhat Ozkasgarli :
>
> Honestly, we are scared to try the same tests with the m600s. When we
> first put them in, we had them more full, but we backed them off to
> reduce the load on them.
I see.
Did you tune anything on the linux layer, like:
vm.vfs_cache_pressure
It may not be necessary to mention specifically since
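For example (the value is illustrative; the default is 100):

sysctl -w vm.vfs_cache_pressure=50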