Hi Andrew,

When disk utilization on the DC node reached 100%, I observed pengine consuming a large amount of memory.
This memory consumption seems to occur when pengine fails to write the pe-input file. When the write fails, the following log is output:

Sep 1 14:15:50 sby2 pengine: [3156]: ERROR: write_xml_file: bzWriteClose() failed: -6

According to valgrind, the memory allocated inside libbz2 does not appear to be released:

==1606== 4,384,028 (10,208 direct, 4,373,820 indirect) bytes in 2 blocks are definitely lost in loss record 104 of 109
==1606==    at 0x4A05FDE: malloc (vg_replace_malloc.c:236)
==1606==    by 0x37E960B972: BZ2_bzWriteOpen (in /lib64/libbz2.so.1.0.4)
==1606==    by 0x4E584C9: write_xml_file (xml.c:744)
==1606==    by 0x52A7818: process_pe_message (pengine.c:191)
==1606==    by 0x4012DF: pe_msg_callback (main.c:60)
==1606==    by 0x59127A9: G_CH_dispatch_int (in /usr/lib64/libplumb.so.2.1.0)
==1606==    by 0x37D5E38F0D: g_main_context_dispatch (in /lib64/libglib-2.0.so.0.2200.5)
==1606==    by 0x37D5E3C937: ??? (in /lib64/libglib-2.0.so.0.2200.5)
==1606==    by 0x37D5E3CD54: g_main_loop_run (in /lib64/libglib-2.0.so.0.2200.5)
==1606==    by 0x401929: main (main.c:177)
[snip]
==1606== 22,322,828 (446,144 direct, 21,876,684 indirect) bytes in 8 blocks are definitely lost in loss record 109 of 109
==1606==    at 0x4A05FDE: malloc (vg_replace_malloc.c:236)
==1606==    by 0x37E960AF4A: BZ2_bzCompressInit (in /lib64/libbz2.so.1.0.4)
==1606==    by 0x37E960B9F1: BZ2_bzWriteOpen (in /lib64/libbz2.so.1.0.4)
==1606==    by 0x4E584C9: write_xml_file (xml.c:744)
==1606==    by 0x52A7818: process_pe_message (pengine.c:191)
==1606==    by 0x4012DF: pe_msg_callback (main.c:60)
==1606==    by 0x59127A9: G_CH_dispatch_int (in /usr/lib64/libplumb.so.2.1.0)
==1606==    by 0x37D5E38F0D: g_main_context_dispatch (in /lib64/libglib-2.0.so.0.2200.5)
==1606==    by 0x37D5E3C937: ??? (in /lib64/libglib-2.0.so.0.2200.5)
==1606==    by 0x37D5E3CD54: g_main_loop_run (in /lib64/libglib-2.0.so.0.2200.5)
==1606==    by 0x401929: main (main.c:177)
[snip]
==1606== LEAK SUMMARY:
==1606==    definitely lost: 456,352 bytes in 10 blocks
==1606==    indirectly lost: 26,250,504 bytes in 30 blocks
==1606==      possibly lost: 20,977,780 bytes in 215 blocks
==1606==    still reachable: 7,237 bytes in 43 blocks
==1606==         suppressed: 0 bytes in 0 blocks
==1606== Reachable blocks (those to which a pointer was found) are not shown.
==1606== To see them, rerun with: --leak-check=full --show-reachable=yes
==1606==
==1606== For counts of detected and suppressed errors, rerun with: -v
==1606== Use --track-origins=yes to see where uninitialised values come from
==1606== ERROR SUMMARY: 134454 errors from 95 contexts (suppressed: 6 from 6)

Best Regards,
Yuusuke

--
----------------------------------------
METRO SYSTEMS CO., LTD

Yuusuke Iida
Mail: iiday...@intellilink.co.jp
----------------------------------------
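
Note on the error code: the -6 reported by bzWriteClose() in the log above corresponds to BZ_IO_ERROR in bzlib.h, which would be consistent with a write failing on a full disk. The helper below is only a sketch for decoding such log lines. The name bz_error_name is hypothetical (it is not part of libbz2 or Pacemaker), and the numeric values are copied from bzlib.h (bzip2 1.0.x) on the assumption that the installed header matches, so that the snippet builds without the library.

```c
#include <stdio.h>

/* Hypothetical helper: map a libbz2 return code to its symbolic name.
 * Values are copied from bzlib.h (bzip2 1.0.x); they are an assumption
 * about the installed header, not something taken from Pacemaker. */
const char *bz_error_name(int code)
{
    switch (code) {
    case  0: return "BZ_OK";
    case -1: return "BZ_SEQUENCE_ERROR";
    case -2: return "BZ_PARAM_ERROR";
    case -3: return "BZ_MEM_ERROR";
    case -4: return "BZ_DATA_ERROR";
    case -5: return "BZ_DATA_ERROR_MAGIC";
    case -6: return "BZ_IO_ERROR";        /* the code seen in the log */
    case -7: return "BZ_UNEXPECTED_EOF";
    case -8: return "BZ_OUTBUFF_FULL";
    case -9: return "BZ_CONFIG_ERROR";
    default: return "unknown";
    }
}

/* Hypothetical helper: print a log-style line with the decoded name. */
void print_bz_failure(const char *func, int code)
{
    printf("%s failed: %d (%s)\n", func, code, bz_error_name(code));
}
```

Calling print_bz_failure("bzWriteClose()", -6) would report the failure together with the name BZ_IO_ERROR, which makes it easier to tell a disk-full condition apart from, for example, a sequence or parameter error.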