On 10/09/2012 01:42 PM, James Harper wrote:
> As per previous post, I'm seeing very high cib load whenever I make a
> configuration change, enough load that things timeout seemingly instantly. I
> thought this was happening well before the configured timeout but now I'm not
> so sure, maybe the timeouts are actually working okay and it just seems
> instant. If the timeouts are in fact working correctly then it's keeping the
> CPU at 100% for over 30 seconds to the exclusion of any monitoring checks (or
> maybe locking the cib so the checks can't run?)
>
> When I make a change I see the likes of this sort of thing in the logs (see
> data below email), which I thought might be solved by this
> https://github.com/ClusterLabs/pacemaker/commit/10e9e579ab032bde3938d7f3e13c414e297ba3e9
> but i just checked the 1.1.7 source that the Debian packages are built from
> and it turns out that that patch already exists in 1.1.7.
>
> Are the messages below actually an indication of a problem? If I understand
> it correctly it's failing to apply the configuration diff and is instead
> forcing a full resync of the configuration across some or all nodes, which is
> causing the high load.
>
> I ran the crm_report but it includes a lot of information I really need to
> remove so I'm reluctant to submit it in full unless it really all is required
> to resolve the problem.
>

Have you already done some tuning, such as increasing batch-limit in your
cluster properties and increasing the corosync timings?

It is hard to say more without more information. If your configuration
details are too sensitive to post on a public mailing list, you can of
course hire someone and share that information under NDA.

Regards,
Andreas

-- 
Need help with Pacemaker?
http://www.hastexo.com/now

> Thanks
>
> James
>
> Oct 9 21:35:30 bitvs2 cib: [6185]: info: apply_xml_diff: Digest mis-match:
> expected e7f7aaa1eb10c7a633e94da57dfda2ac, calculated
> 445109490690d53e024c333fac6ab4c9
> Oct 9 21:35:30 bitvs2 cib: [6185]: notice: cib_process_diff: Diff 0.1354.85
> -> 0.1354.86 not applied to 0.1354.85: Failed application of an update diff
> Oct 9 21:35:30 bitvs2 cib: [6185]: info: cib_server_process_diff: Requesting
> re-sync from peer
> Oct 9 21:35:30 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not
> applying diff 0.1354.85 -> 0.1354.86 (sync in progress)
> Oct 9 21:35:30 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not
> applying diff 0.1354.86 -> 0.1354.87 (sync in progress)
> Oct 9 21:35:30 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not
> applying diff 0.1354.86 -> 0.1354.87 (sync in progress)
> Oct 9 21:35:30 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not
> applying diff 0.1354.86 -> 0.1354.87 (sync in progress)
> Oct 9 21:35:30 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not
> applying diff 0.1354.87 -> 0.1355.1 (sync in progress)
> Oct 9 21:35:30 bitvs2 cib: [6185]: info: cib_process_diff: Diff 0.1355.1 ->
> 0.1355.2 not applied to 0.1354.85: current "epoch" is less than required
> Oct 9 21:35:30 bitvs2 cib: [6185]: info: cib_server_process_diff: Requesting
> re-sync from peer
> Oct 9 21:35:33 bitvs2 cib: [6185]: info: apply_xml_diff: Digest mis-match:
> expected b77fae3dc1e835e0d6a3d1a305d262cb, calculated
> 120fcac6996ff9f5148f69712fc54689
> Oct 9 21:35:33 bitvs2 cib: [6185]: notice: cib_process_diff: Diff 0.1357.7
> -> 0.1357.8 not applied to 0.1357.7: Failed application of an update diff
> Oct 9 21:35:33 bitvs2 cib: [6185]: info: cib_server_process_diff: Requesting
> re-sync from peer
> Oct 9 21:35:33 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not
> applying diff 0.1357.7 -> 0.1357.8 (sync in progress)
> Oct 9 21:35:33 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not
> applying diff 0.1357.8 -> 0.1358.1 (sync in progress)
> Oct 9 21:35:33 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not
> applying diff 0.1358.1 -> 0.1358.2 (sync in progress)
> Oct 9 21:35:33 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not
> applying diff 0.1358.2 -> 0.1358.3 (sync in progress)
> Oct 9 21:35:33 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not
> applying diff 0.1358.3 -> 0.1359.1 (sync in progress)
> Oct 9 21:35:33 bitvs2 cib: [6185]: info: cib_process_diff: Diff 0.1359.1 ->
> 0.1359.2 not applied to 0.1357.7: current "epoch" is less than required
> Oct 9 21:35:33 bitvs2 cib: [6185]: info: cib_server_process_diff: Requesting
> re-sync from peer
> Oct 9 21:35:33 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not
> applying diff 0.1359.2 -> 0.1359.3 (sync in progress)
> Oct 9 21:35:33 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not
> applying diff 0.1359.3 -> 0.1359.4 (sync in progress)
> Oct 9 21:35:33 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not
> applying diff 0.1359.4 -> 0.1359.5 (sync in progress)
> Oct 9 21:35:33 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not
> applying diff 0.1359.5 -> 0.1359.6 (sync in progress)
> Oct 9 21:35:33 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not
> applying diff 0.1359.6 -> 0.1359.7 (sync in progress)
> Oct 9 21:35:33 bitvs2 cib: [6185]: info: cib_process_diff: Diff 0.1359.7 ->
> 0.1359.8 not applied to 0.1357.7: current "epoch" is less than required
> Oct 9 21:35:33 bitvs2 cib: [6185]: info: cib_server_process_diff: Requesting
> re-sync from peer
> Oct 9 21:35:33 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not
> applying diff 0.1359.8 -> 0.1359.9 (sync in progress)
> Oct 9 21:35:33 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not
> applying diff 0.1359.9 -> 0.1359.10 (sync in progress)
> Oct 9 21:35:33 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not
> applying diff 0.1359.10 -> 0.1359.11 (sync in progress)
> Oct 9 21:35:33 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not
> applying diff 0.1359.11 -> 0.1359.12 (sync in progress)
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>
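
To get a feel for how often the cib falls back to a re-sync, the quoted log
messages can be tallied with a small script. This is only a rough sketch and
not part of Pacemaker: the function names (summarize, parse_version) are made
up for illustration, the patterns are copied from the message texts quoted
above, and it assumes each log entry sits on a single line, as it would in the
syslog file rather than in this wrapped email. The version split assumes the
usual admin_epoch.epoch.num_updates layout of CIB versions.

```python
import re
from collections import Counter

# Patterns copied from the message texts in the quoted log; other Pacemaker
# versions may word these messages differently.
PATTERNS = {
    "digest_mismatch": re.compile(r"apply_xml_diff: Digest mis-match"),
    "diff_failed": re.compile(r"cib_process_diff: Diff \S+ -> \S+ not applied"),
    "resync_requested": re.compile(r"Requesting re-sync from peer"),
    "diff_skipped": re.compile(r"Not applying diff \S+ -> \S+ \(sync in progress\)"),
}


def parse_version(text):
    """Split a version such as '0.1354.85' into a comparable tuple of ints.

    Assumes the admin_epoch.epoch.num_updates layout.
    """
    return tuple(int(part) for part in text.split("."))


def summarize(lines):
    """Count how many lines match each pattern (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# A few entries from the quoted log, joined back onto single lines.
SAMPLE = [
    "Oct 9 21:35:30 bitvs2 cib: [6185]: info: apply_xml_diff: Digest mis-match: "
    "expected e7f7aaa1eb10c7a633e94da57dfda2ac, calculated 445109490690d53e024c333fac6ab4c9",
    "Oct 9 21:35:30 bitvs2 cib: [6185]: notice: cib_process_diff: Diff 0.1354.85 -> "
    "0.1354.86 not applied to 0.1354.85: Failed application of an update diff",
    "Oct 9 21:35:30 bitvs2 cib: [6185]: info: cib_server_process_diff: Requesting re-sync from peer",
    "Oct 9 21:35:30 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not "
    "applying diff 0.1354.85 -> 0.1354.86 (sync in progress)",
    'Oct 9 21:35:30 bitvs2 cib: [6185]: info: cib_process_diff: Diff 0.1355.1 -> '
    '0.1355.2 not applied to 0.1354.85: current "epoch" is less than required',
    "Oct 9 21:35:30 bitvs2 cib: [6185]: info: cib_server_process_diff: Requesting re-sync from peer",
]

for name, count in sorted(summarize(SAMPLE).items()):
    print(f"{name}: {count}")

# The local copy (0.1354.85) is older than the diff's source version
# (0.1355.1), which is what the '"epoch" is less than required' message reports.
print(parse_version("0.1354.85") < parse_version("0.1355.1"))
```

Running summarize over the real log file, for example with
summarize(open(path)) where path is wherever syslog writes the cib messages,
would show whether every configuration change triggers a re-sync or only some
of them.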