On Thu, Jan 7, 2010 at 2:31 AM, Thomas Guthmann <tguthm...@iseek.com.au> wrote:
> Re,
>
>> I've just had 2 orphans (tom-DNS:2,tom-DNS:3) for a clone containing 2
>> groups (tom-DNS:0, tom-DNS:1). Anyway the situation is far better than
>> before, no more craziness during cleanups. I will dig and test more
>> tomorrow and give you an update to see if I can reproduce the issue.
>
> 1. I was wondering if the 2 orphans (always exactly 2, displayed by hb_gui)
> that sometimes appear after the clone creation are linked to the fact that
> I have 4 nodes in the cluster. We are using an asymmetrical cluster
> and only 2 nodes can be DNS servers; the 2 others can't. I was wondering
> if the 2 "orphans" could be the 2 other ones.
Yep, that would do it, since we'd probe on all four hosts.
Can you log a bug for that please?

> I found this in the DC logs:
>
> Jan 7 09:38:19 tom-dns1 pengine: [3227]: info: unpack_config: Node scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
> Jan 7 09:38:19 tom-dns1 pengine: [3227]: info: determine_online_status: Node tom-lb2.clone.wars.com.au is online
> Jan 7 09:38:19 tom-dns1 pengine: [3227]: WARN: process_orphan_resource: Nothing known about resource tom-lo-vip-ns1 running on tom-lb2.clone.wars.com.au
> Jan 7 09:38:19 tom-dns1 pengine: [3227]: info: log_data_element: create_fake_resource: Orphan resource <primitive id="tom-lo-vip-ns1" type="IPaddr" class="ocf" provider="heartbeat" />
> Jan 7 09:38:19 tom-dns1 pengine: [3227]: info: process_orphan_resource: Making sure orphan tom-lo-vip-ns1 is stopped
> Jan 7 09:38:19 tom-dns1 pengine: [3227]: WARN: process_orphan_resource: Nothing known about resource tom-lo-vip-ns2 running on tom-lb2.clone.wars.com.au
> Jan 7 09:38:19 tom-dns1 pengine: [3227]: info: log_data_element: create_fake_resource: Orphan resource <primitive id="tom-lo-vip-ns2" type="IPaddr" class="ocf" provider="heartbeat" />
> Jan 7 09:38:19 tom-dns1 pengine: [3227]: info: process_orphan_resource: Making sure orphan tom-lo-vip-ns2 is stopped
> Jan 7 09:38:19 tom-dns1 pengine: [3227]: WARN: process_orphan_resource: Nothing known about resource tom-lo-vip-dns1 running on tom-lb2.clone.wars.com.au
> Jan 7 09:38:19 tom-dns1 pengine: [3227]: info: log_data_element: create_fake_resource: Orphan resource <primitive id="tom-lo-vip-dns1" type="IPaddr" class="ocf" provider="heartbeat" />
> Jan 7 09:38:19 tom-dns1 pengine: [3227]: info: process_orphan_resource: Making sure orphan tom-lo-vip-dns1 is stopped
> Jan 7 09:38:19 tom-dns1 pengine: [3227]: WARN: process_orphan_resource: Nothing known about resource tom-lo-vip-dns2 running on tom-lb2.clone.wars.com.au
> Jan 7 09:38:19 tom-dns1 pengine: [3227]: info:
> log_data_element: create_fake_resource: Orphan resource <primitive id="tom-lo-vip-dns2" type="IPaddr" class="ocf" provider="heartbeat" />
> Jan 7 09:38:19 tom-dns1 pengine: [3227]: info: process_orphan_resource: Making sure orphan tom-lo-vip-dns2 is stopped
> Jan 7 09:38:19 tom-dns1 pengine: [3227]: notice: unpack_rsc_op: Hard error - tom-named:0_monitor_0 failed with rc=5: Preventing tom-DNS-clone from re-starting on tom-lb2.clone.wars.com.au
> Jan 7 09:38:19 tom-dns1 pengine: [3227]: WARN: process_orphan_resource: Nothing known about resource tom-named running on tom-lb2.clone.wars.com.au
> Jan 7 09:38:19 tom-dns1 pengine: [3227]: info: log_data_element: create_fake_resource: Orphan resource <primitive id="tom-named" type="named" class="ocf" provider="iseek" />
> Jan 7 09:38:19 tom-dns1 pengine: [3227]: info: process_orphan_resource: Making sure orphan tom-named is stopped
>
> Is it possible that these are the ones displayed by hb_gui? Is there a text
> way to see them? crm_mon -1 doesn't display them and shows me only the
> following. So do these orphans really exist? Do you need the CIB to have a look?
>
> Clone Set: tom-DNS-clone
> Started: [ tom-dns2.clone.wars.com.au tom-dns1.clone.wars.com.au ]
>
>
> 2. Is it normal behaviour to still have these when you have just deleted a
> clone? Extract from the HTML output of crm_mon (see 3.
> as well)
> (Partially) Inactive Resources
> tom-lo-vip-ns2:0 (ocf::heartbeat:IPaddr): ORPHANED Stopped
> tom-lo-vip-dns1:0 (ocf::heartbeat:IPaddr): ORPHANED Stopped
> tom-named:0 (ocf::iseek:named): ORPHANED Stopped
> tom-lo-vip-dns2:0 (ocf::heartbeat:IPaddr): ORPHANED Stopped
> tom-lo-vip-ns1:0 (ocf::heartbeat:IPaddr): ORPHANED Stopped
> tom-named:1 (ocf::iseek:named): ORPHANED Stopped
> tom-lo-vip-dns2:1 (ocf::heartbeat:IPaddr): ORPHANED Stopped
> tom-lo-vip-ns1:1 (ocf::heartbeat:IPaddr): ORPHANED Stopped
> tom-lo-vip-ns2:1 (ocf::heartbeat:IPaddr): ORPHANED Stopped
> tom-lo-vip-dns1:1 (ocf::heartbeat:IPaddr): ORPHANED Stopped
>
>
> 3. Weird issue, definitely a small bug: "crm_mon -r -n -1" and "crm_mon -r
> -n -h /tmp/tom.html" don't give exactly the same results. In the HTML
> version I can see the orphans (written in red), and in console mode I have
> nothing except the title 'Inactive resources:' (note it's a different title
> than in the HTML version, there is no 'partially' word - see 2. above for the
> HTML version)
>
> Sorry guys to bother you with these issues, but we'd like to master and
> understand pacemaker well before going into production. It's a nice product
> but it needs a bit of practice to ride it :)
>
> Cheers,
> Thomas
>
> _______________________________________________
> Pacemaker mailing list
> Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
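
On the "is there a text way to see that" question: one option, independent of crm_mon, is to scan the DC log for the process_orphan_resource warnings quoted above and group them by node. The following is only a minimal sketch, assuming the log lines look exactly like the ones in this thread; the helper name orphans_by_node and the SAMPLE text are illustrative and not part of Pacemaker.

```python
import re
from collections import defaultdict

# Matches the pengine warning quoted in this thread, e.g.
# "... WARN: process_orphan_resource: Nothing known about resource X running on NODE"
# \s+ also tolerates a log entry that was wrapped across lines.
ORPHAN_RE = re.compile(
    r"process_orphan_resource:\s+Nothing known about resource\s+(\S+)"
    r"\s+running on\s+(\S+)"
)


def orphans_by_node(log_text):
    """Return {node: [resource ids]} in first-seen order (hypothetical helper)."""
    found = defaultdict(list)
    for match in ORPHAN_RE.finditer(log_text):
        resource, node = match.groups()
        if resource not in found[node]:  # skip repeats from later pengine runs
            found[node].append(resource)
    return dict(found)


# Sample lines copied from the log excerpt in this thread.
SAMPLE = """\
Jan 7 09:38:19 tom-dns1 pengine: [3227]: WARN: process_orphan_resource: Nothing known about resource tom-lo-vip-ns1 running on tom-lb2.clone.wars.com.au
Jan 7 09:38:19 tom-dns1 pengine: [3227]: info: process_orphan_resource: Making sure orphan tom-lo-vip-ns1 is stopped
Jan 7 09:38:19 tom-dns1 pengine: [3227]: WARN: process_orphan_resource: Nothing known about resource tom-named running on tom-lb2.clone.wars.com.au
"""

if __name__ == "__main__":
    for node, resources in orphans_by_node(SAMPLE).items():
        print(node + ": " + ", ".join(resources))
```

Run against the real DC log, this would show whether the orphans are all reported on the two non-DNS nodes, which is what the probe explanation predicts.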