Re: [ceph-users] failing to respond to cache pressure

2018-09-07 Thread Zhenshi Zhou
Hi Eugen, Thanks for the update. The message still appears in the logs these days. The client_oc_size option in my cluster has been 100MB from the start. I have configured mds_cache_memory_limit to 4G, and since then the messages have become less frequent. What I noticed is that the mds task reserves 6G of memory (in top) whil
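This matches how the limit is documented to behave: mds_cache_memory_limit bounds only the metadata cache, not the whole process, so the MDS resident size is expected to exceed it. A minimal sketch for comparing the two on the MDS host (mds.a below is a placeholder for the actual daemon name):
$ ceph daemon mds.a config get mds_cache_memory_limit
# returns the effective limit in bytes (4294967296 = 4G)
$ top -b -n 1 -p $(pgrep -d, ceph-mds)
# the RES column shows the actual resident size of the MDS process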

Re: [ceph-users] failing to respond to cache pressure

2018-09-06 Thread Eugen Block
Hi, I would like to update this thread for others struggling with cache pressure. The last time we hit that message was more than three weeks ago (the workload has not changed), so it seems that our current configuration fits our workload. Reducing client_oc_size to 100 MB (from the default 200
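For reference, a minimal ceph.conf sketch of the change described here (the value is in bytes, and it typically takes effect when a client starts or remounts):
[client]
# halve the client object cache from the 200 MB default
client_oc_size = 104857600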

Re: [ceph-users] failing to respond to cache pressure

2018-08-23 Thread Eugen Block
Hi, > I think it does have a positive effect on the messages, as I get fewer messages than before. That's nice. I also see definitely fewer cache pressure messages than before. I have started to play around with the client-side cache configuration. I halved the client object cache size f

Re: [ceph-users] failing to respond to cache pressure

2018-08-20 Thread Zhenshi Zhou
Hi Eugen, I think it does have a positive effect on the messages, as I get fewer messages than before. Eugen Block wrote on Monday, August 20, 2018 at 9:29 PM: > Update: we are getting these messages again. > > So the search continues... > > Quoting Eugen Block: > > Hi, > > Depending on your kernel (memo

Re: [ceph-users] failing to respond to cache pressure

2018-08-20 Thread Eugen Block
Update: we are getting these messages again. So the search continues... Quoting Eugen Block: Hi, Depending on your kernel (memory leaks with CephFS), increasing the mds_cache_memory_limit could be of help. What is your current setting? ceph:~ # ceph daemon mds. config show | grep

Re: [ceph-users] failing to respond to cache pressure

2018-08-16 Thread Eugen Block
Hi, currently our Ceph servers use kernel 4.4.104; our clients mostly have newer versions, something like 4.4.126. > I set mds_cache_memory_limit from 1G to 2G, and then to 4G. I still get the warning messages, and the messages would disappear in 1 or 2 minutes. Did at least the number of clients d

Re: [ceph-users] failing to respond to cache pressure

2018-08-16 Thread Zhenshi Zhou
Hi Eugen, I set mds_cache_memory_limit from 1G to 2G, and then to 4G. I still get the warning messages, and they disappear after 1 or 2 minutes. Which kernel versions do your machines run? Zhenshi Zhou wrote on Monday, August 13, 2018 at 10:15 PM: > Hi Eugen, > The command shows "mds_cache_memory_limit": "1073741

Re: [ceph-users] failing to respond to cache pressure

2018-08-13 Thread Zhenshi Zhou
Hi Wido, The server and client use the same version, 12.2.5. And the command shows:
# ceph versions
{
    "mon": {
        "ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a) luminous (stable)": 3
    },
    "mgr": {
        "ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a

Re: [ceph-users] failing to respond to cache pressure

2018-08-13 Thread Zhenshi Zhou
Hi Eugen, The command shows "mds_cache_memory_limit": "1073741824". I'll try increasing the cache size. Thanks. Eugen Block wrote on Monday, August 13, 2018 at 9:48 PM: > Hi, > > Depending on your kernel (memory leaks with CephFS), increasing the > mds_cache_memory_limit could be of help. What is your current

Re: [ceph-users] failing to respond to cache pressure

2018-08-13 Thread Eugen Block
Hi, Depending on your kernel (memory leaks with CephFS), increasing the mds_cache_memory_limit could be of help. What is your current setting? ceph:~ # ceph daemon mds. config show | grep mds_cache_memory_limit We had these messages for months, almost every day. It would occur when hour
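The limit can also be raised at runtime through the admin socket, without restarting the daemon. A sketch, where mds.<name> stands in for the actual daemon id and the value is in bytes (4294967296 = 4G):
$ ceph daemon mds.<name> config set mds_cache_memory_limit 4294967296
To make the change persistent across restarts, set the same option in the [mds] section of ceph.conf.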

Re: [ceph-users] failing to respond to cache pressure

2018-08-13 Thread Wido den Hollander
On 08/13/2018 01:22 PM, Zhenshi Zhou wrote:
> Hi,
> Recently, the cluster runs healthy, but I get warning messages every day:
Which version of Ceph? Which version of clients? Can you post:
$ ceph versions
$ ceph features
$ ceph fs status
Wido
> 2018-08-13 17:39:23.682213 [INF]  Cluster is

Re: [ceph-users] failing to respond to cache pressure

2016-05-17 Thread Brett Niver
Hi Oliver, Our corresponding RHCS downstream release of CephFS will be labeled "Tech Preview", which means it's unsupported, but we believe it's stable enough for experimentation. When we do release CephFS as "production ready", that means we've done even more exhaustive testing and that this i

Re: [ceph-users] failing to respond to cache pressure

2016-05-17 Thread Mark Nelson
Sent: Monday, May 16, 2016 7:36 AM To: Andrus, Brian Contractor Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] failing to respond to cache pressure On Mon, May 16, 2016 at 3:11 PM, Andrus, Brian Contractor wrote: Both client and server are Jewel 10.2.0 So the fuse client, correct? If

Re: [ceph-users] failing to respond to cache pressure

2016-05-17 Thread Oliver Dzombic
Hi Brett, aside from the question of whether what Brian experienced has anything to do with code stability: since it is new to me that there is a difference between "stable" and "production ready", I would be happy if you could tell me what the table looks like. One of the team was joking something lik

Re: [ceph-users] failing to respond to cache pressure

2016-05-17 Thread John Spray
School > Monterey, California > voice: 831-656-6238 > -----Original Message----- > From: John Spray [mailto:jsp...@redhat.com] > Sent: Monday, May 16, 2016 7:36 AM > To: Andrus, Brian Contractor > Cc: ceph-users@lists.ceph.com > Subject: Re: [c

Re: [ceph-users] failing to respond to cache pressure

2016-05-16 Thread Andrus, Brian Contractor
jsp...@redhat.com] Sent: Monday, May 16, 2016 7:36 AM To: Andrus, Brian Contractor Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] failing to respond to cache pressure On Mon, May 16, 2016 at 3:11 PM, Andrus, Brian Contractor wrote: > Both client and server are Jewel 10.2.0 So the fuse

Re: [ceph-users] failing to respond to cache pressure

2016-05-16 Thread Dan van der Ster
On 16 May 2016 16:36, "John Spray" wrote: > > On Mon, May 16, 2016 at 3:11 PM, Andrus, Brian Contractor > wrote: > > Both client and server are Jewel 10.2.0 > > So the fuse client, correct? If you are up for investigating further, > with potential client bugs (or performance issues) it is often
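One generic way to get more visibility into a ceph-fuse client (a sketch of a common approach, not necessarily the exact step the truncated text goes on to suggest) is to raise client-side debug logging while reproducing the problem, then watch how the client handles cache recall requests:
[client]
# very verbose client-side logging; enable only temporarily
debug client = 20
log file = /var/log/ceph/ceph-client.$pid.log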

Re: [ceph-users] failing to respond to cache pressure

2016-05-16 Thread Mark Nelson
Sent: Monday, May 16, 2016 2:28 AM To: Andrus, Brian Contractor Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] failing to respond to cache pressure On Mon, May 16, 2016 at 5:42 AM, Andrus, Brian Contractor wrote: So this ‘production ready’ CephFS for jewel seems a little not quite…. Curren

Re: [ceph-users] failing to respond to cache pressure

2016-05-16 Thread John Spray
whether you're seeing expected performance or you're seeing the outcome of a bug. John > Brian Andrus > ITACS/Research Computing > Naval Postgraduate School > Monterey, California > voice: 831-656-6238 > -----Original Message-----

Re: [ceph-users] failing to respond to cache pressure

2016-05-16 Thread Andrus, Brian Contractor
[mailto:jsp...@redhat.com] Sent: Monday, May 16, 2016 2:28 AM To: Andrus, Brian Contractor Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] failing to respond to cache pressure On Mon, May 16, 2016 at 5:42 AM, Andrus, Brian Contractor wrote: > So this ‘production ready’ CephFS f

Re: [ceph-users] failing to respond to cache pressure

2016-05-16 Thread Brett Niver
The terminology we're using to describe CephFS in Jewel is "stable", as opposed to "production ready". Thanks, Brett On Monday, May 16, 2016, John Spray wrote: > On Mon, May 16, 2016 at 5:42 AM, Andrus, Brian Contractor wrote: > > So this ‘production ready’ CephFS for jewel seems a little not

Re: [ceph-users] failing to respond to cache pressure

2016-05-16 Thread John Spray
On Mon, May 16, 2016 at 5:42 AM, Andrus, Brian Contractor wrote: > So this ‘production ready’ CephFS for jewel seems a little not quite…. > Currently I have a single system mounting CephFS and merely scp-ing data to it. > The CephFS mount has 168 TB used, 345 TB / 514 TB avail. > E
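When this warning appears, the MDS is asking clients to release capabilities and they are not doing so quickly enough. A sketch for seeing which sessions hold the most caps (mds.<name> is a placeholder for the daemon id):
$ ceph daemon mds.<name> session ls
# each session entry lists the client address along with num_caps and
# num_leases; the client named in the health warning usually stands out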

Re: [ceph-users] Failing to respond to cache pressure?

2015-05-05 Thread John Spray
On 05/05/2015 18:17, Lincoln Bryant wrote: Hello all, I'm seeing some warnings regarding trimming and cache pressure. We're running 0.94.1 on our cluster, with erasure coding + cache tiering backing our CephFS. health HEALTH_WARN mds0: Behind on trimming (250/30)
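In that warning, the first number is the current count of MDS journal segments and the second is the trimming threshold, which comes from mds_log_max_segments (30 by default in releases of this era). A sketch of giving the MDS more headroom while it catches up (mds.0 is a placeholder for the daemon id):
$ ceph daemon mds.0 config set mds_log_max_segments 60
Note this only widens the threshold; if the MDS cannot trim at all, the root cause (for example a slow cache tier in front of the metadata pool) still has to be addressed.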