Re: [ceph-users] Crush rule check

2016-12-12 Thread Wido den Hollander
> On 10 December 2016 at 12:45, Adrian Saul wrote: > > Hi Ceph-users, > I just want to double check a new crush ruleset I am creating - the intent > here is that over 2 DCs, it will select one DC, and place two copies on > separate hosts in that DC. The pools created on this will
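
A minimal sketch of a rule matching that stated intent, assuming a crush map with a root named "default" and bucket types "datacenter" and "host" (the names and the ruleset number are illustrative, not taken from the original map):

    rule replicated_one_dc {
        ruleset 1
        type replicated
        min_size 2
        max_size 2
        step take default
        step choose firstn 1 type datacenter
        step chooseleaf firstn 2 type host
        step emit
    }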

Re: [ceph-users] rsync kernel client cephfs mkstemp no space left on device

2016-12-12 Thread Mike Miller
John, thanks for emphasizing this; before this workaround we had tried many different kernel versions, including 4.5.x, all with the same result. The problem might be particular to our environment, as most of the client machines (compute servers) have large RAM, so plenty of cache space for inodes/dentries.

Re: [ceph-users] Crush rule check

2016-12-12 Thread Adrian Saul
Thanks Wido. I had found the show-utilization test, but had not seen show-mappings - that confirmed it for me. thanks, Adrian > -Original Message- > From: Wido den Hollander [mailto:w...@42on.com] > Sent: Monday, 12 December 2016 7:07 PM > To: ceph-users@lists.ceph.com; Adrian Saul >
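
For reference, both checks can be run offline against the compiled crush map; a sketch (file name, rule number and replica count are illustrative):

    ceph osd getcrushmap -o crushmap.bin
    crushtool -i crushmap.bin --test --rule 1 --num-rep 2 --show-mappings
    crushtool -i crushmap.bin --test --rule 1 --num-rep 2 --show-utilization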

Re: [ceph-users] Crush rule check

2016-12-12 Thread Wido den Hollander
> On 12 December 2016 at 9:08, Adrian Saul wrote: > > Thanks Wido. > > I had found the show-utilization test, but had not seen show-mappings - that > confirmed it for me. > One thing to check though. The number of DCs is a fixed number, right? You will always have two DCs with X ho

Re: [ceph-users] Crush rule check

2016-12-12 Thread Adrian Saul
> One thing to check though. The number of DCs is a fixed number right? You > will always have two DCs with X hosts. I am keeping it open in case we add other sites for some reason, but likely to remain at 2. > > In that case: > > step choose firstn 2 type datacenter > step chooseleaf first

Re: [ceph-users] 2x replication: A BIG warning

2016-12-12 Thread Oliver Humpage
> On 12 Dec 2016, at 07:59, Wido den Hollander wrote: > > As David already said, when all OSDs are up and in for a PG Ceph will wait > for ALL OSDs to Ack the write. Writes in RADOS are always synchronous. Apologies, I missed that. Clearly I’ve been misunderstanding min_size for a while then:
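
As a reminder of what min_size does and does not control, a quick sketch against a hypothetical pool named "rbd": size sets how many replicas every write goes to, while min_size only sets how many OSDs must be in the acting set for the PG to keep accepting IO; it does not reduce the number of acks a write waits for while all OSDs are up.

    ceph osd pool set rbd size 3       # every object is written to 3 OSDs
    ceph osd pool set rbd min_size 2   # PGs stay active as long as at least 2 of them are up
    ceph osd pool get rbd min_size     # check the current value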

Re: [ceph-users] CephFS FAILED assert(dn->get_linkage()->is_null())

2016-12-12 Thread John Spray
On Sat, Dec 10, 2016 at 1:50 PM, Sean Redmond wrote: > Hi Goncalo, > > With the output from "ceph tell mds.0 damage ls" we tracked the inodes of > two damaged directories using 'find /mnt/ceph/ -inum $inode'; after > reviewing the paths involved we confirmed a backup was available for this > data
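
The lookup described there is roughly the following; the mount point and inode number are illustrative:

    ceph tell mds.0 damage ls            # lists damage entries with the affected inode numbers
    find /mnt/ceph/ -inum 1099511627776  # map an inode number back to a path (slow on large trees)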

Re: [ceph-users] How to start/restart osd and mon manually (not by init script or systemd)

2016-12-12 Thread Craig Chi
Hi Wang, Did you ever check whether there are error logs in /var/log/ceph/ceph-mon.xt2.log or /var/log/ceph/ceph-osd.0.log? BTW, just out of curiosity, why don't you just use systemd to start your osd and mon? systemd automatically handles restarting the processes after a few kinds of simple failure.
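
For reference, a sketch of both ways of (re)starting the daemons on a systemd-based jewel install (the daemon IDs are taken from the log names above, otherwise illustrative):

    # systemd-managed: restarted automatically after simple failures
    systemctl restart ceph-mon@xt2
    systemctl restart ceph-osd@0

    # manual foreground start, handy for debugging
    ceph-mon -f --cluster ceph --id xt2 --setuser ceph --setgroup ceph
    ceph-osd -f --cluster ceph --id 0 --setuser ceph --setgroup ceph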

[ceph-users] OSDs cpu usage

2016-12-12 Thread George Kissandrakis
Hi, I have a jewel/xenial ceph installation with 61 OSDs, mixed sas/sata, in hosts under two roots. The installation has version jewel 10.2.3-1xenial (monitors included). Two hosts were newly added and version jewel 10.2.4-1xenial was installed on them. On these two hosts with the newer packages, ceph-osd

Re: [ceph-users] CephFS FAILED assert(dn->get_linkage()->is_null())

2016-12-12 Thread Sean Redmond
Hey John, Thanks for your response here. We took the below action on the journal as a method to move past hitting the mds assert initially: #cephfs-journal-tool journal export backup.bin (this command failed, we suspect due to corruption) #cephfs-journal-tool event recover_dentries summary, thi
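
For context, those two commands are the first steps of the documented cephfs-journal-tool disaster-recovery sequence; a sketch (to be run only with the MDS stopped, and with the caveat that resetting the journal discards uncommitted metadata):

    cephfs-journal-tool journal export backup.bin       # back up the journal first
    cephfs-journal-tool event recover_dentries summary  # replay salvageable dentries into the backing store
    cephfs-journal-tool journal reset                   # discard the (corrupt) journal
    cephfs-table-tool all reset session                 # clear stale client sessions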

Re: [ceph-users] Server crashes on high mount volume

2016-12-12 Thread Ken Dreyer
On Fri, Dec 9, 2016 at 2:28 PM, Diego Castro wrote: > Hello, my case is very specific but I think others may have this issue. > > I have a ceph cluster up and running hosting block storage for my openshift > (kubernetes) cluster. > Things go bad when I "evacuate" a node, which moves all contain

Re: [ceph-users] [EXTERNAL] Ceph performance is too good (impossible..)...

2016-12-12 Thread Will . Boege
My understanding is that when using direct=1 on a raw block device, FIO (a.k.a. you) will have to handle all the sector alignment, or the request will get buffered to perform the alignment. Try adding the --blockalign=512b option to your jobs, or better yet just use the native FIO RBD engine. Someth
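
A minimal sketch of a job file using the native RBD engine (requires fio built with rbd support; the pool, image and client names are illustrative):

    [global]
    ioengine=rbd
    clientname=admin
    pool=rbd
    rbdname=fio-test
    direct=1
    bs=4m
    iodepth=1
    runtime=60
    time_based

    [seq-read]
    rw=read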

Re: [ceph-users] OSDs cpu usage

2016-12-12 Thread ulembke
Hi, update to 10.2.5 - available since Saturday. Udo On 2016-12-12 13:40, George Kissandrakis wrote: Hi, I have a jewel/xenial ceph installation with 61 OSDs, mixed sas/sata, in hosts with two roots. The installation has version jewel 10.2.3-1xenial (and monitors). Two hosts were newly a

Re: [ceph-users] Ceph performance is too good (impossible..)...

2016-12-12 Thread ulembke
Hi, if you wrote from a client, the data was written into one (or more) placement groups in 4 MB chunks. These PGs are written to the journal and the osd disk, and because of this the data is also in the Linux file buffer on the osd node (until the OS needs the storage for other data (file buffer or anything el
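
Which is why read benchmarks usually drop the page cache on the OSD nodes (and on the client) between the write and the read pass; a sketch:

    # on each OSD node and on the client, before the read pass
    sync
    echo 3 > /proc/sys/vm/drop_caches   # drops page cache, dentries and inodes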

Re: [ceph-users] OSDs cpu usage

2016-12-12 Thread George Kissandrakis
I saw that 10.2.5 is out, but if a bug appeared on 10.2.4, would that have been fixed in 10.2.5, or just upgrade and hope for the best? George Kissandrakis

Re: [ceph-users] OSDs cpu usage

2016-12-12 Thread David Riedl
10.2.5 exists because of this bug. Here are the patch notes http://docs.ceph.com/docs/master/release-notes/#v10-2-5-jewel Regards David Am 12.12.2016 um 17:09 schrieb George Kissandrakis: I saw that 10.2.5 is out but if a bug appeared on 10.2.4 would that have been fixed in 10.2.5, or just

Re: [ceph-users] OSDs cpu usage

2016-12-12 Thread George Kissandrakis
Seems solved with 10.2.5. Thank you. George Kissandrakis

[ceph-users] Looking for a definition for some undocumented variables

2016-12-12 Thread Jake Young
I've seen these referenced a few times in the mailing list, can someone explain what they do exactly? What are the defaults for these values? osd recovery sleep and osd recovery max single start Thanks! Jake

Re: [ceph-users] Looking for a definition for some undocumented variables

2016-12-12 Thread John Spray
On Mon, Dec 12, 2016 at 5:23 PM, Jake Young wrote: > I've seen these referenced a few times in the mailing list, can someone > explain what they do exactly? > > What are the defaults for these values? > > osd recovery sleep > > and > > osd recover max single start Aside from the definition, you c

Re: [ceph-users] Looking for a definition for some undocumented variables

2016-12-12 Thread Jake Young
Thanks John, To partially answer my own question: OPTION(osd_recovery_sleep, OPT_FLOAT, 0) // seconds to sleep between recovery ops OPTION(osd_recovery_max_single_start, OPT_U64, 1) Funny, in the examples where I've seen osd_recovery_max_single_start it is being set to 1, which is the default.
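
For reference, both values can be inspected on a running OSD and changed at runtime; a sketch (the OSD id is illustrative, and injectargs changes do not persist across restarts):

    ceph daemon osd.0 config get osd_recovery_sleep
    ceph daemon osd.0 config get osd_recovery_max_single_start
    ceph tell osd.* injectargs '--osd_recovery_sleep 0.1'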

Re: [ceph-users] A question about io consistency in osd down case

2016-12-12 Thread Jason Dillaman
On Sun, Dec 11, 2016 at 10:48 PM, zhong-yan.gu wrote: > Hi Jason, > After reviewing the code, my understanding is that: > case1: Primary osd A, replica osd B, replica osd C, min size is 2, during > IO, C is down. > 1. A and B journal writes are done, C failed and down. > 2. IO waiting for osd_hear

[ceph-users] Red Hat Summit CFP Closing

2016-12-12 Thread Patrick McGarry
Hey cephers, Just a friendly reminder that the CFP for Red Hat summit is quickly coming to a close. This year the "upstream" talks are integrated into the main conference (instead of being relegated to their own attached mini-conf), so they will get much more traction. If you would like to submit

Re: [ceph-users] Server crashes on high mount volume

2016-12-12 Thread Diego Castro
I haven't had a chance to try it yet; I'll let you know how it goes. Thank you! --- Diego Castro / The CloudFather GetupCloud.com - Eliminamos a Gravidade 2016-12-12 12:02 GMT-03:00 Ken Dreyer : > On Fri, Dec 9, 2016 at 2:28 PM, Diego Castro > wrote: > > Hello, my case is very specific but i think other

Re: [ceph-users] Server crashes on high mount volume

2016-12-12 Thread Ilya Dryomov
On Mon, Dec 12, 2016 at 9:16 PM, Diego Castro wrote: > I haven't had a chance to try it yet; I'll let you know how it goes. This should be fixed by commit [1] upstream and it was indeed backported to 7.3. [1] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=811c6688774613a78bfa020

[ceph-users] What happens if all replica OSDs journals are broken?

2016-12-12 Thread Kevin Olbrich
Hi, just in case: what happens when all replica journal SSDs are broken at once? The PGs will most likely be stuck inactive, but from what I have read, the journals just need to be replaced (http://ceph.com/planet/ceph-recover-osds-after-ssd-journal-failure/). Does this also work in this case? Kind regards,
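
The journal replacement procedure being referred to is roughly the following sketch (OSD id illustrative); note that --flush-journal only works while the old journal is still readable, so with a dead, unflushed journal the in-flight writes on that OSD are gone:

    ceph osd set noout                 # avoid rebalancing while the OSD is down
    systemctl stop ceph-osd@0
    # ceph-osd -i 0 --flush-journal   # only possible if the old journal device is still readable
    ceph-osd -i 0 --mkjournal          # create a fresh journal on the replacement device
    systemctl start ceph-osd@0
    ceph osd unset noout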

Re: [ceph-users] What happens if all replica OSDs journals are broken?

2016-12-12 Thread Christian Balzer
On Mon, 12 Dec 2016 22:41:41 +0100 Kevin Olbrich wrote: > Hi, > > just in case: what happens when all replica journal SSDs are broken at once? > That would be bad, as in BAD. In theory you just "lost" all the associated OSDs and their data. In practice everything but the in-flight data at th

Re: [ceph-users] A question about io consistency in osd down case

2016-12-12 Thread Shinobu Kinjo
On Sat, Dec 10, 2016 at 11:00 PM, Jason Dillaman wrote: > I should clarify that if the OSD has silently failed (e.g. the TCP > connection wasn't reset and packets are just silently being dropped / > not being acked), IO will pause for up to "osd_heartbeat_grace" before The number is how long an O

Re: [ceph-users] Looking for a definition for some undocumented variables

2016-12-12 Thread Brad Hubbard
On Tue, Dec 13, 2016 at 3:56 AM, Jake Young wrote: > Thanks John, > > To partially answer my own question: > > OPTION(osd_recovery_sleep, OPT_FLOAT, 0) // seconds to sleep between > recovery ops > > OPTION(osd_recovery_max_single_start, OPT_U64, 1) > > Funny, in the examples where I've seen osd_re

Re: [ceph-users] [EXTERNAL] Ceph performance is too good (impossible..)...

2016-12-12 Thread V Plus
The same.. see: A: (g=0): rw=read, bs=5M-5M/5M-5M/5M-5M, ioengine=*libaio*, iodepth=1 ... fio-2.2.10 Starting 16 processes A: (groupid=0, jobs=16): err= 0: pid=27579: Mon Dec 12 20:36:10 2016 mixed: io=122515MB, bw=6120.3MB/s, iops=1224, runt= 20018msec I think at the end, the only one way to s

[ceph-users] v11.1.0 kraken candidate released

2016-12-12 Thread Abhishek L
Hi everyone, This is the first release candidate for Kraken, the next stable release series. There have been major changes from jewel with many features being added. Please note the upgrade process from jewel, before upgrading. Major Changes from Jewel - *RADOS*: * Th

Re: [ceph-users] v11.1.0 kraken candidate released

2016-12-12 Thread Ben Hines
Hi! Can you clarify whether this release note applies to Jewel upgrades only? I.e., can we go Infernalis -> Kraken? It is in the 'upgrading from jewel' section, which would imply that it doesn't apply to Infernalis -> Kraken (or any other version to Kraken), but it does say 'All clusters'. Upgrading

[ceph-users] Revisiting: Many clients (X) failing to respond to cache pressure

2016-12-12 Thread Goncalo Borges
Hi Ceph(FS)ers... I am currently running the following environment in production: - ceph/cephfs 10.2.2. - All infrastructure is on the same version (rados cluster, mons, mds and cephfs clients). - We mount cephfs using ceph-fuse. Since yesterday we have had our cluster in warnin
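
The usual first diagnostics for that warning, as a sketch (run on the active MDS host; the MDS name is a placeholder):

    ceph health detail                     # shows the warning detail, including which client sessions are involved
    ceph daemon mds.<name> session ls      # per-client session details, including the number of held caps
    ceph daemon mds.<name> config get mds_cache_size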

Re: [ceph-users] v11.1.0 kraken candidate released

2016-12-12 Thread Ben Hines
It looks like the second release note in that section answers my question. sortbitwise is only supported in jewel, and it is required to be already set for Kraken-upgraded OSDs to even start up, so one must go to Jewel first. The section heading should probably say just "Upgrading to Kraken" rather
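
In other words, the flag has to be set on the running jewel cluster before any OSD is upgraded to kraken; a sketch:

    ceph osd dump | grep flags    # check whether sortbitwise is already set
    ceph osd set sortbitwise      # set it cluster-wide before upgrading the OSDs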

Re: [ceph-users] Wrong pg count when pg number is large

2016-12-12 Thread Gregory Farnum
On Thu, Dec 1, 2016 at 8:35 AM, Craig Chi wrote: > Hi list, > > I am testing the Ceph cluster with unpractical pg numbers to do some > experiments. > > But when I use ceph -w to watch my cluster status, I see pg numbers doubled. > From my ceph -w > > root@mon1:~# ceph -w > cluster 1c33bf75-e08

Re: [ceph-users] osd down detection broken in jewel?

2016-12-12 Thread Gregory Farnum
On Wed, Nov 30, 2016 at 8:31 AM, Manuel Lausch wrote: > Yes. This parameter is used in the condition described there: > http://docs.ceph.com/docs/jewel/rados/configuration/mon-osd-interaction/#osds-report-their-status and works. I think the default > timeout of 900s is quite a bit large. > > A
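
For reference, the two settings involved in that docs section, as a ceph.conf sketch (the values shown are, to the best of my knowledge, the jewel defaults):

    [global]
    # peers report an OSD down after it has missed heartbeats for this long (seconds)
    osd heartbeat grace = 20
    # if no failure reports arrive at all, the mons mark an unresponsive OSD down after this long
    mon osd report timeout = 900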

Re: [ceph-users] Ceph Fuse Strange Behavior Very Strange

2016-12-12 Thread Gregory Farnum
On Sat, Dec 3, 2016 at 10:54 PM, Winger Cheng wrote: > Hi, all: > I have two small test on our cephfs cluster: > > time for i in {1..1}; do echo hello > file${i}; done && time rm * && > time for i in {1..1}; do echo hello > file${i}; done && time rm * > > Client A : use kernel

Re: [ceph-users] Wrong pg count when pg number is large

2016-12-12 Thread Craig Chi
Hi Greg, Sorry, I didn't preserve the environment due to urgent needs. However, I think you are right, because at that time I had just purged all pools and re-created them in a short time. Thank you very much! Sincerely, Craig Chi On 2016-12-13 14:21, Gregory Farnum wrote: > On Thu, Dec 1, 2016 at 8:35

Re: [ceph-users] v11.1.0 kraken candidate released

2016-12-12 Thread Dietmar Rieder
Hi, this is good news! Thanks. As far as I can see, RBD now (experimentally) supports EC data pools. Is this also true for CephFS? It is not stated in the announcement, so I wonder if and when EC pools are planned to be supported by CephFS. ~regards Dietmar On 12/13/2016 03:28 AM, Abhishek L wrote

[ceph-users] can cache-mode be set to readproxy for tier cache with ceph 0.94.9 ?

2016-12-12 Thread JiaJia Zhong
Hi cephers: we are using ceph hammer 0.94.9 (yes, it's not the latest, jewel), with some SSD OSDs for tiering; cache-mode is set to readproxy and everything seems to be as expected, but when reading some small files from cephfs we got 0 bytes. I did some searching and got the be
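
For context, the tiering setup being described is roughly the following (pool names are illustrative):

    ceph osd tier add cephfs_data cache_pool          # attach the SSD cache pool to the base pool
    ceph osd tier cache-mode cache_pool readproxy     # the mode in question
    ceph osd tier set-overlay cephfs_data cache_pool  # route client IO through the cache tier
    ceph osd dump | grep cache_pool                   # confirm the cache-mode actually in effect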