On 10/09/2012 01:42 PM, James Harper wrote:
> As per my previous post, I'm seeing very high cib load whenever I make a
> configuration change, enough load that things time out seemingly instantly. I
> thought this was happening well before the configured timeout, but now I'm not
> so sure; maybe the timeouts are actually working okay and it just seems
> instant. If the timeouts are in fact working correctly, then it's keeping the
> CPU at 100% for over 30 seconds to the exclusion of any monitoring checks (or
> maybe locking the cib so the checks can't run?).
> 
> When I make a change I see messages like these in the logs (see the data
> below this email). I thought this might be solved by this commit:
> https://github.com/ClusterLabs/pacemaker/commit/10e9e579ab032bde3938d7f3e13c414e297ba3e9
> but I just checked the 1.1.7 source that the Debian packages are built from
> and it turns out that the patch already exists in 1.1.7.
> 
> Are the messages below actually an indication of a problem? If I understand 
> it correctly it's failing to apply the configuration diff and is instead 
> forcing a full resync of the configuration across some or all nodes, which is 
> causing the high load.
> 
> I ran crm_report, but the output includes a lot of information I really need
> to remove, so I'm reluctant to submit it in full unless all of it really is
> required to resolve the problem.
> 

Have you already done some tuning, such as increasing batch-limit in your
cluster properties and increasing the corosync timings? It is hard to say
more without more information. If your configuration details are too
sensitive to post on a public mailing list, you can of course hire
someone and provide that information under an NDA.
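
If the full crm_report cannot be shared, a sanitized count of the cib
messages might still help narrow things down. The following is only a
minimal sketch and not part of Pacemaker: the function name
summarize_cib_log and the category names are made up for illustration, and
the patterns are copied from the log excerpt quoted in this thread, so other
Pacemaker versions may word these messages differently. It assumes one log
entry per line, as in the syslog file itself.

```python
import re
from collections import Counter

# Patterns taken from the messages quoted in this thread (Pacemaker 1.1.7).
# The category names on the left are hypothetical labels, not Pacemaker terms.
PATTERNS = {
    "digest_mismatch": re.compile(r"apply_xml_diff: Digest mis-match"),
    "diff_not_applied": re.compile(r"cib_process_diff: Diff \S+ -> \S+ not applied"),
    "resync_requested": re.compile(r"Requesting re-sync from peer"),
    "diff_skipped": re.compile(r"Not applying diff \S+ -> \S+ \(sync in progress\)"),
}


def summarize_cib_log(lines):
    """Count how often each kind of cib diff message appears in `lines`."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

Passing an open log file (or any iterable of lines) to summarize_cib_log
returns the counts per category. A high resync_requested count relative to
the number of configuration changes would support the suspicion that diffs
are being rejected and full re-syncs requested instead.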

Regards,
Andreas

-- 
Need help with Pacemaker?
http://www.hastexo.com/now

> Thanks
> 
> James
> 
> Oct  9 21:35:30 bitvs2 cib: [6185]: info: apply_xml_diff: Digest mis-match: expected e7f7aaa1eb10c7a633e94da57dfda2ac, calculated 445109490690d53e024c333fac6ab4c9
> Oct  9 21:35:30 bitvs2 cib: [6185]: notice: cib_process_diff: Diff 0.1354.85 -> 0.1354.86 not applied to 0.1354.85: Failed application of an update diff
> Oct  9 21:35:30 bitvs2 cib: [6185]: info: cib_server_process_diff: Requesting re-sync from peer
> Oct  9 21:35:30 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not applying diff 0.1354.85 -> 0.1354.86 (sync in progress)
> Oct  9 21:35:30 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not applying diff 0.1354.86 -> 0.1354.87 (sync in progress)
> Oct  9 21:35:30 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not applying diff 0.1354.86 -> 0.1354.87 (sync in progress)
> Oct  9 21:35:30 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not applying diff 0.1354.86 -> 0.1354.87 (sync in progress)
> Oct  9 21:35:30 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not applying diff 0.1354.87 -> 0.1355.1 (sync in progress)
> Oct  9 21:35:30 bitvs2 cib: [6185]: info: cib_process_diff: Diff 0.1355.1 -> 0.1355.2 not applied to 0.1354.85: current "epoch" is less than required
> Oct  9 21:35:30 bitvs2 cib: [6185]: info: cib_server_process_diff: Requesting re-sync from peer
> Oct  9 21:35:33 bitvs2 cib: [6185]: info: apply_xml_diff: Digest mis-match: expected b77fae3dc1e835e0d6a3d1a305d262cb, calculated 120fcac6996ff9f5148f69712fc54689
> Oct  9 21:35:33 bitvs2 cib: [6185]: notice: cib_process_diff: Diff 0.1357.7 -> 0.1357.8 not applied to 0.1357.7: Failed application of an update diff
> Oct  9 21:35:33 bitvs2 cib: [6185]: info: cib_server_process_diff: Requesting re-sync from peer
> Oct  9 21:35:33 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not applying diff 0.1357.7 -> 0.1357.8 (sync in progress)
> Oct  9 21:35:33 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not applying diff 0.1357.8 -> 0.1358.1 (sync in progress)
> Oct  9 21:35:33 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not applying diff 0.1358.1 -> 0.1358.2 (sync in progress)
> Oct  9 21:35:33 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not applying diff 0.1358.2 -> 0.1358.3 (sync in progress)
> Oct  9 21:35:33 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not applying diff 0.1358.3 -> 0.1359.1 (sync in progress)
> Oct  9 21:35:33 bitvs2 cib: [6185]: info: cib_process_diff: Diff 0.1359.1 -> 0.1359.2 not applied to 0.1357.7: current "epoch" is less than required
> Oct  9 21:35:33 bitvs2 cib: [6185]: info: cib_server_process_diff: Requesting re-sync from peer
> Oct  9 21:35:33 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not applying diff 0.1359.2 -> 0.1359.3 (sync in progress)
> Oct  9 21:35:33 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not applying diff 0.1359.3 -> 0.1359.4 (sync in progress)
> Oct  9 21:35:33 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not applying diff 0.1359.4 -> 0.1359.5 (sync in progress)
> Oct  9 21:35:33 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not applying diff 0.1359.5 -> 0.1359.6 (sync in progress)
> Oct  9 21:35:33 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not applying diff 0.1359.6 -> 0.1359.7 (sync in progress)
> Oct  9 21:35:33 bitvs2 cib: [6185]: info: cib_process_diff: Diff 0.1359.7 -> 0.1359.8 not applied to 0.1357.7: current "epoch" is less than required
> Oct  9 21:35:33 bitvs2 cib: [6185]: info: cib_server_process_diff: Requesting re-sync from peer
> Oct  9 21:35:33 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not applying diff 0.1359.8 -> 0.1359.9 (sync in progress)
> Oct  9 21:35:33 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not applying diff 0.1359.9 -> 0.1359.10 (sync in progress)
> Oct  9 21:35:33 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not applying diff 0.1359.10 -> 0.1359.11 (sync in progress)
> Oct  9 21:35:33 bitvs2 cib: [6185]: notice: cib_server_process_diff: Not applying diff 0.1359.11 -> 0.1359.12 (sync in progress)
> 
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
> 

