On 30 Apr 2014, at 9:01 pm, Greg Murphy <greg.mur...@gamesparks.com> wrote:

> Hi
> 
> I’m running a two-node Pacemaker cluster on Ubuntu Saucy (13.10), kernel 
> 3.11.0-17-generic and the Ubuntu Pacemaker package, version 
> 1.1.10+git20130802-1ubuntu1.

The problem is that I have no way of knowing what code is/isn't included in 
'1.1.10+git20130802-1ubuntu1'.
You could try setting the following in your environment before starting 
pacemaker, though:

# Variables for running child daemons under valgrind and/or checking for memory problems
G_SLICE=always-malloc
MALLOC_PERTURB_=221 # or 0
MALLOC_CHECK_=3     # or 0,1,2
PCMK_valgrind_enabled=lrmd
VALGRIND_OPTS="--leak-check=full --trace-children=no --num-callers=25 --log-file=/var/lib/pacemaker/valgrind-%p --suppressions=/usr/share/pacemaker/tests/valgrind-pcmk.suppressions --gen-suppressions=all"
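
Once lrmd has run under valgrind for a while and then exited (memcheck writes its leak summary when the process ends), the log named by --log-file can be checked for the "definitely lost" line. Below is a minimal Python sketch of that check. The helper name definitely_lost and the sample text are illustrative assumptions, not output from this cluster.

```python
import re

# Matches the "definitely lost" line of a memcheck LEAK SUMMARY.
LOST = re.compile(r"definitely lost: ([\d,]+) bytes in ([\d,]+) blocks")

def definitely_lost(log_text):
    """Return (bytes, blocks) from a valgrind leak summary, or None if absent."""
    match = LOST.search(log_text)
    if match is None:
        return None
    # valgrind prints thousands separators, so strip the commas first.
    return tuple(int(group.replace(",", "")) for group in match.groups())

# Illustrative text only, not real output from lrmd.
sample = "==1234==    definitely lost: 4,096 bytes in 2 blocks\n"
print(definitely_lost(sample))
```

To use it on a real log, read one of the /var/lib/pacemaker/valgrind-<pid> files and pass its contents to the function.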


> The cluster is configured with a DRBD master/slave set and then a failover 
> resource group containing MySQL (along with its DRBD filesystem) and a Zabbix 
> Proxy and Agent.
> 
> Since I built the cluster around two months ago I’ve noticed that on the 
> active node the memory footprint of lrmd gradually grows to quite a 
> significant size. The cluster was last restarted three weeks ago, and now 
> lrmd has over 1GB of mapped memory on the active node and only 151MB on the 
> passive node. Current excerpts from /proc/PID/status are:
> 
> Active node
> VmPeak:  1146740 kB
> VmSize:  1146740 kB
> VmLck:         0 kB
> VmPin:         0 kB
> VmHWM:    267680 kB
> VmRSS:    188764 kB
> VmData:  1065860 kB
> VmStk:       136 kB
> VmExe:        32 kB
> VmLib:     10416 kB
> VmPTE:      2164 kB
> VmSwap:   822752 kB
> 
> Passive node
> VmPeak:   220832 kB
> VmSize:   155428 kB
> VmLck:         0 kB
> VmPin:         0 kB
> VmHWM:      4568 kB
> VmRSS:      3880 kB
> VmData:    74548 kB
> VmStk:       136 kB
> VmExe:        32 kB
> VmLib:     10416 kB
> VmPTE:       172 kB
> VmSwap:        0 kB
> 
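
To compare the two nodes on the same fields over time, the Vm* lines of /proc/PID/status can be parsed into numbers. This is a rough Python sketch, assuming the usual "Name: value kB" layout. The function name parse_status is made up for illustration, and the input values are copied from the active-node excerpt.

```python
def parse_status(text):
    """Map each Vm* field of /proc/PID/status to its value in kB."""
    fields = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key.startswith("Vm"):
            fields[key] = int(rest.split()[0])
    return fields

# Values copied from the active-node excerpt.
active = parse_status("""\
VmSize:  1146740 kB
VmRSS:    188764 kB
VmData:  1065860 kB
VmSwap:   822752 kB
""")

# Resident plus swapped-out pages, to set against the data segment size.
touched = active["VmRSS"] + active["VmSwap"]
print(f"RSS + swap = {touched} kB of {active['VmData']} kB data segment")
```

On a live system the same function could be fed the contents of /proc/<pid>/status for the lrmd process. For these figures, RSS plus swap comes to 1011516 kB, so most of the 1065860 kB data segment has been touched and later swapped out.
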
> During the last week or so I’ve taken a couple of snapshots of 
> /proc/PID/smaps on the active node, and the heap particularly stands out as 
> growing: (I have the full outputs captured if they’ll help)
> 
> 20140422
> 7f92e1578000-7f92f218b000 rw-p 00000000 00:00 0                          [heap]
> Size:             274508 kB
> Rss:              180152 kB
> Pss:              180152 kB
> Shared_Clean:          0 kB
> Shared_Dirty:          0 kB
> Private_Clean:         0 kB
> Private_Dirty:    180152 kB
> Referenced:       120472 kB
> Anonymous:        180152 kB
> AnonHugePages:         0 kB
> Swap:              91568 kB
> KernelPageSize:        4 kB
> MMUPageSize:           4 kB
> Locked:                0 kB
> VmFlags: rd wr mr mw me ac
> 
> 
> 20140423
> 7f92e1578000-7f92f305e000 rw-p 00000000 00:00 0                          [heap]
> Size:             289688 kB
> Rss:              184136 kB
> Pss:              184136 kB
> Shared_Clean:          0 kB
> Shared_Dirty:          0 kB
> Private_Clean:         0 kB
> Private_Dirty:    184136 kB
> Referenced:        69748 kB
> Anonymous:        184136 kB
> AnonHugePages:         0 kB
> Swap:             103112 kB
> KernelPageSize:        4 kB
> MMUPageSize:           4 kB
> Locked:                0 kB
> VmFlags: rd wr mr mw me ac
> 
> 20140430
> 7f92e1578000-7f92fc01d000 rw-p 00000000 00:00 0                          [heap]
> Size:             436884 kB
> Rss:              140812 kB
> Pss:              140812 kB
> Shared_Clean:          0 kB
> Shared_Dirty:          0 kB
> Private_Clean:       744 kB
> Private_Dirty:    140068 kB
> Referenced:        43600 kB
> Anonymous:        140812 kB
> AnonHugePages:         0 kB
> Swap:             287392 kB
> KernelPageSize:        4 kB
> MMUPageSize:           4 kB
> Locked:                0 kB
> VmFlags: rd wr mr mw me ac
> 
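
The three heap snapshots can be reduced to a growth rate, which makes it easier to tell steady growth from a one-off jump. This is a small Python sketch using only the Size figures and dates quoted in those snapshots. The function name growth_per_day is a hypothetical helper.

```python
from datetime import date

# (snapshot date, heap Size in kB), copied from the three smaps excerpts.
snapshots = [
    (date(2014, 4, 22), 274508),
    (date(2014, 4, 23), 289688),
    (date(2014, 4, 30), 436884),
]

def growth_per_day(samples):
    """Return (start, end, kB per day) for each consecutive pair of snapshots."""
    rates = []
    for (d0, size0), (d1, size1) in zip(samples, samples[1:]):
        days = (d1 - d0).days
        rates.append((d0, d1, (size1 - size0) / days))
    return rates

for start, end, rate in growth_per_day(snapshots):
    print(f"{start} -> {end}: {rate:.0f} kB/day")
```

For these numbers the heap grows by 15180 kB/day over the first interval and 21028 kB/day over the second, i.e. roughly 15 to 21 MB per day with no sign of levelling off.
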
> I noticed in the release notes for 1.1.10-rc1 
> (https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.10-rc1) 
> that there was work done to fix “crmd: lrmd: stonithd: fixed memory leaks” 
> but I’m not sure which particular bug this was related to. (And those fixes 
> should be in the version I’m running anyway).
> 
> I’ve also spotted a few memory leak fixes in 
> https://github.com/beekhof/pacemaker, but I’m not sure whether they relate to 
> my issue (assuming I have a memory leak and this isn’t expected behaviour).
> 
> Is there additional debugging that I can perform to check whether I have a 
> leak, or is there enough evidence to justify upgrading to 1.1.11?
> 
> Thanks in advance
> 
> Greg Murphy
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
