I've been copying happily for days now (not very fast, but the MDSs were
stable), but eventually the MDSs started flapping again due to large
cache sizes (they are being killed after 11M inodes). I could solve the
problem by temporarily increasing the cache size in order to allow them
to rejoin, …
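A minimal sketch of such a temporary bump, assuming the ~19GB
mds_cache_memory_limit mentioned later in this thread; the values are
illustrative, not the cluster's actual settings:
ceph config set mds mds_cache_memory_limit 34359738368   # ~32 GiB of temporary headroom so the MDSs can rejoin
ceph config set mds mds_cache_memory_limit 20401094656   # revert to the ~19 GiB baseline once they are stable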
> Your parallel rsync job is only getting 150 creates per second? What
> was the previous throughput?
I am actually not quite sure what the exact throughput was or is, or what
I can expect. It varies so much. I am copying from a 23GB file list that
is split into 3000 chunks, which are then processed …
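A hypothetical sketch of such a chunked, parallel copy; the file names,
paths, and the parallelism of 16 are assumptions, not the actual job:
split -a 4 -n l/3000 filelist.txt chunk.                               # 3000 line-aligned chunks of the file list
ls chunk.* | xargs -P 16 -I {} rsync -a --files-from={} /source/ /mnt/cephfs/dest/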
On Tue, Aug 6, 2019 at 7:57 AM Janek Bevendorff wrote:
> > 4k req/s is too fast for a create workload on one MDS. That must
> > include other operations like getattr.
>
> That is rsync going through millions of files checking which ones need
> updating. Right now there are not actually any create operations, since
> I restarted the copy job.
> 4k req/s is too fast for a create workload on one MDS. That must
> include other operations like getattr.
That is rsync going through millions of files checking which ones need
updating. Right now there are not actually any create operations, since
I restarted the copy job.
I wouldn't expect …
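For anyone following along, the request mix can be inspected directly;
mds.<id> is a placeholder for the active MDS daemon:
ceph fs status                               # per-MDS request rate
ceph daemon mds.<id> perf dump mds_server    # per-operation counters (run on the MDS host)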
On Tue, Aug 6, 2019 at 12:48 AM Janek Bevendorff wrote:
> > However, now my client processes are basically in constant I/O wait
> > state and the CephFS is slow for everybody. After I restarted the copy
> > job, I got around 4k reqs/s and then it went down to 100 reqs/s with
> > everybody waiting their turn. …
However, now my client processes are basically in constant I/O wait
state and the CephFS is slow for everybody. After I restarted the copy
job, I got around 4k reqs/s and then it went down to 100 reqs/s with
everybody waiting their turn. So yes, it does seem to help, but it
increases latency …
Thanks that helps. Looks like the problem is that the MDS is not
automatically trimming its cache fast enough. Please try bumping
mds_cache_trim_threshold:
ceph config set mds mds_cache_trim_threshold 512K
That did help. Somewhat. I removed the aggressive recall settings I set
before and …
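For completeness, the threshold bump above can be checked against the
running config; this assumes the centralized config database used
elsewhere in this thread:
ceph config get mds mds_cache_trim_threshold   # 512K is parsed as 524288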
On Mon, Aug 5, 2019 at 12:21 AM Janek Bevendorff wrote:
> Hi,
> > You can also try increasing the aggressiveness of the MDS recall but
> > I'm surprised it's still a problem with the settings I gave you:
> >
> > ceph config set mds mds_recall_max_caps 15000
> > ceph config set mds mds_recall_max_decay_rate 0.75
Hi,
You can also try increasing the aggressiveness of the MDS recall but
I'm surprised it's still a problem with the settings I gave you:
ceph config set mds mds_recall_max_caps 15000
ceph config set mds mds_recall_max_decay_rate 0.75
I finally had the chance to try the more aggressive recall settings …
I am not sure if making caps recall more aggressive helps. It seems to be
the client failing to respond to it (at least that's what the warnings
say). But I will try your new suggested settings as soon as I get the
chance and will report back with the results.
On 25 Jul 2019 11:00 pm, Patrick Donnelly wrote: …
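The warnings in question are the standard cache-pressure health messages;
they can be checked with:
ceph health detail | grep -i 'cache pressure'   # MDS_CLIENT_RECALL: clients failing to respond to cache pressure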
On Thu, Jul 25, 2019 at 12:49 PM Janek Bevendorff wrote:
> > Based on that message, it would appear you still have an inode limit
> > in place ("mds_cache_size"). Please unset that config option. Your
> > mds_cache_memory_limit is apparently ~19GB.
>
> No, I do not have an inode limit set. Only the memory limit.
> Based on that message, it would appear you still have an inode limit
> in place ("mds_cache_size"). Please unset that config option. Your
> mds_cache_memory_limit is apparently ~19GB.
No, I do not have an inode limit set. Only the memory limit.
> There is another limit mds_max_caps_per_client …
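A sketch of inspecting and adjusting that per-client cap limit; 1048576
is the upstream default, stated here as an assumption:
ceph config get mds mds_max_caps_per_client
ceph config set mds mds_max_caps_per_client 1048576   # default: 1M caps per client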
On Thu, Jul 25, 2019 at 3:08 AM Janek Bevendorff wrote:
> The rsync job has been copying quite happily for two hours now. The good
> news is that the cache size isn't increasing unboundedly with each
> request anymore. The bad news is that it still is increasing after all,
> though much slower. I …
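One way to watch whether the cache is still growing, assuming
admin-socket access on the MDS host; mds.<id> is a placeholder:
ceph daemon mds.<id> cache status        # current cache memory usage
ceph daemon mds.<id> perf dump mds_mem   # inode and dentry counts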
It's possible the MDS is not being aggressive enough with asking the
single (?) client to reduce its cache size. There were recent changes
[1] to the MDS to improve this. However, the defaults may not be
aggressive enough for your client's workload. Can you try:
ceph config set mds mds_recall_max_caps 15000
ceph config set mds mds_recall_max_decay_rate 0.75
+ other ceph-users
On Wed, Jul 24, 2019 at 10:26 AM Janek Bevendorff wrote:
> > what's the ceph.com mailing list? I wondered whether this list is dead but
> > it's the list announced on the official ceph.com homepage, isn't it?
> There are two mailing lists announced on the website. If you go …