On 01/07/2013, at 5:17 PM, Florian Crouzat <gen...@floriancrouzat.net> wrote:
> On 29/06/2013 01:22, Andrew Beekhof wrote:
>>
>> On 29/06/2013, at 12:22 AM, Digimer <li...@alteeve.ca> wrote:
>>
>>> On 06/28/2013 06:21 AM, Andrew Beekhof wrote:
>>>>
>>>> On 28/06/2013, at 5:22 PM, Lars Marowsky-Bree <l...@suse.com> wrote:
>>>>
>>>>> On 2013-06-27T12:53:01, Digimer <li...@alteeve.ca> wrote:
>>>>>
>>>>>> primitive fence_n01_psu1_off stonith:fence_apc_snmp \
>>>>>>         params ipaddr="an-p01" pcmk_reboot_action="off" port="1" pcmk_host_list="an-c03n01.alteeve.ca"
>>>>>> primitive fence_n01_psu1_on stonith:fence_apc_snmp \
>>>>>>         params ipaddr="an-p01" pcmk_reboot_action="on" port="1" pcmk_host_list="an-c03n01.alteeve.ca"
>>>>>
>>>>> So every device twice, including location constraints? I see potential
>>>>> for optimization by improving how the fence code handles this ... That's
>>>>> abhorrently complex. (And I'm not sure the 'action' parameter ought to
>>>>> be overwritten.)
>>>>
>>>> I'm not crazy about it either, because it means the device is tied to a
>>>> specific command.
>>>> But it seems to be something all the RHCS people try to do...
>>>
>>> Maybe something in the RHCS water cooler made us all mad... ;)
>>>
>>>>> Glad you got it working, though.
>>>>>
>>>>>> location loc_fence_n01_ipmi fence_n01_ipmi -inf: an-c03n01.alteeve.ca
>>>>> [...]
>>>>>
>>>>> I'm not sure you need any of these location constraints, by the way.
>>>>> Did you test whether it works without them?
>>>>>
>>>>>> Again, this is after just one test. I will want to test it several more
>>>>>> times before I consider it reliable. Ideally, I would love to hear
>>>>>> Andrew or others confirm this looks sane/correct.
>>>>>
>>>>> It looks correct, but not quite sane. ;-) That seems not to be
>>>>> something you can address, though. I'm thinking that fencing topology
>>>>> should be smart enough, if multiple fencing devices are specified, to
>>>>> know how to expand them to "first all off (if off fails anywhere, it's a
>>>>> failure), then all on (if on fails, it is not a failure)". That'd
>>>>> greatly simplify the syntax.
>>>>
>>>> The RH agents have apparently already been updated to support multiple
>>>> ports.
>>>> I'm really not keen on having stonith-ng do this.
>>>
>>> This doesn't help people who have dual power rails/PDUs for power
>>> redundancy.
>>
>> I'm yet to be convinced that having two PDUs is helping those people in
>> the first place.
>> If it were actually useful, I suspect more than two/three people would
>> have asked for it in the last decade.
>
> Well, it's probably because many people are still toying around with
> Pacemaker, and I assume that not many advanced RHCS users have yet tried
> to translate their current RHCS cluster to Pacemaker. Digimer and I did,
> and we both failed to get the equivalent <device> configuration we had in
> our RHCS setup.

Yes, but RHEL isn't the only enterprise distro out there.
It's not like Pacemaker has never been deployed in critical environments
during the last decade.
German Air Traffic Control (http://www.novell.com/success/dfs.html), for
example.
Will planes fall out of the sky if your cluster fails?

> I suspect more and more people will hit this issue sooner or later.
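For the record, an untested sketch (crm shell syntax, reusing the primitive
names from Digimer's snippet) of how the existing fencing topology support
can chain those devices: devices in the same level are comma-separated, are
run in the order listed, and all of them must succeed, while the levels
themselves are tried in order. So IPMI is attempted first and the PDU
off/on pair only if it fails; a second PDU would just add its own _off/_on
pair to the same level.

    # level 1: IPMI; level 2: cut the power, then restore it (both must succeed)
    fencing_topology \
        an-c03n01.alteeve.ca: fence_n01_ipmi fence_n01_psu1_off,fence_n01_psu1_on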
>
> Anyway, whatever will follow in terms of configuration primitives or API,
> thanks to Digimer's tests we now have something (even if inelegant)
> working :)
>
> --
> Cheers,
> Florian Crouzat

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org