Hi Pritha,

At the time, the 'primary' cluster (i.e. the one with the active data
set) was receiving backup files from a small number of machines. Prior
to replication being enabled, it was using ~10% of RAM on the RadosGW
boxes.


Without replication enabled, neither cluster sees any spikes in memory
usage under normal operation, only a slight increase when deep
scrubbing (I'm monitoring cluster memory usage as a whole, so OSD
memory increases would account for that). Neither cluster was
performing a deep scrub at the time. The 'secondary' cluster (i.e. the
one I was trying to sync data to, which now has replication disabled
again) has had a RadosGW process running under normal load since
June 17 with replication disabled, and it is using 1084M RSS. This
matches the historical graphs for the primary cluster, which has
hovered around 1G RSS for RadosGW processes for the last 6 months.


I tested this out this morning, and enabling replication caused all
RadosGW processes to increase in memory usage (and continue
increasing) from ~1000M RSS to ~20G RSS in about 2 minutes. As soon as
replication is enabled (as in, within seconds), the RSS of RadosGW on
both clusters starts to increase and does not drop. This appears to
happen during metadata sync as well as during normal data syncing.
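
A rough sketch of how the RSS growth described above could be sampled
on a Linux host, assuming the standard VmRSS field in
/proc/<pid>/status. The function names are hypothetical and chosen for
illustration only; this is not the tooling behind the graphs mentioned
in this message. The last line only restates the approximate figures
quoted above (~1000M to ~20G in about 2 minutes) as an average rate.

```python
import re


def parse_vmrss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kb(pid):
    """Read the current RSS (kB) of a process from /proc (Linux only)."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_vmrss_kb(status_file.read())


def growth_rate_mb_per_s(start_mb, end_mb, elapsed_s):
    """Average RSS growth in MB per second between two samples."""
    return (end_mb - start_mb) / elapsed_s


# Approximate figures from the report: ~1000M -> ~20G in about 2 minutes.
approx_rate = growth_rate_mb_per_s(1000, 20000, 120)
```

With those approximate numbers the average works out to roughly
158 MB/s per RadosGW process, which is why the growth is visible within
seconds of enabling replication.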


I then killed all RadosGW processes on the 'primary' side, and memory
usage of the RadosGW processes on the 'secondary' side continued to
increase at the same rate. There are no further messages in the
RadosGW log while this is occurring (since there is no client traffic
and no further replication traffic).

If I kill the active RadosGW processes, they start back up and normal
memory usage resumes.

Cheers,

Ben.


----- Original Message -----
> From: "Pritha Srivastava" <prsrivas@...>
> To: ceph-users@...
> Sent: Monday, June 27, 2016 07:32:23
> Subject: Re: [ceph-users] Jewel Multisite RGW Memory Issues

> Do you know if the memory usage is high only during load from clients and is
> steady otherwise?
> What was the nature of the workload at the time of the sync operation?

> Thanks,
> Pritha
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com