[Linux-HA] Does ocf:Filesystem work with ext4

2009-12-07 Thread Dinh N. Quoc
Hello,

Does the current ocf:Filesystem resource agent support ext4? If anyone
knows, please advise.

Many thanks,
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] Does ocf:Filesystem work with ext4

2009-12-07 Thread Darren.Mansell
Looks like it just tries to mount whatever you put in there:

case "$FSTYPE" in
none) $MOUNT $options $DEVICE $MOUNTPOINT &&
bind_mount
;;
"") $MOUNT $options $DEVICE $MOUNTPOINT ;;
*) $MOUNT -t $FSTYPE $options $DEVICE $MOUNTPOINT ;;
Esac
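
So as long as the kernel on both nodes has ext4 support and mount(8) knows
about it, setting fstype should be all that's needed. A rough sketch of what
the primitive could look like (the device, mount point and timeouts here are
made up; adjust them to your environment):

primitive fs_data ocf:heartbeat:Filesystem \
    params device="/dev/sdb1" directory="/mnt/data" fstype="ext4" \
    op monitor interval="20s" timeout="40s" \
    op start interval="0" timeout="60s" \
    op stop interval="0" timeout="60s"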


-Original Message-
From: linux-ha-boun...@lists.linux-ha.org
[mailto:linux-ha-boun...@lists.linux-ha.org] On Behalf Of Dinh N. Quoc
Sent: 07 December 2009 10:16
To: linux-ha@lists.linux-ha.org
Subject: [Linux-HA] Does ocf:Filesystem work with ext4

Hello,

Does the current ocf:Filesystem resource agent support ext4? If anyone
knows, please advise.

Many thanks,
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] Strange issues with IPaddr & IPaddr2

2009-12-07 Thread Marian Marinov
I have filed a bug report about this issue:

http://developerbugs.linux-foundation.org/show_bug.cgi?id=2242
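
In the meantime, for anyone hitting the same "action monitor_0 does not
exist" error: as far as I understand it, the crm shell checks the requested
operations against the actions advertised in the agent's metadata, so it is
worth verifying what the agent actually reports. Something along these lines
should show it (the OCF path below is the usual default; adjust it if your
install differs):

# what the crm shell knows about the agent
crm ra info IPaddr2

# ask the agent for its metadata directly and look at the actions section
export OCF_ROOT=/usr/lib/ocf
/usr/lib/ocf/resource.d/heartbeat/IPaddr2 meta-data | grep -A 10 '<actions>'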


On Monday 07 December 2009 06:18:13 Marian Marinov wrote:
> Hello,
> I'm just setting up a new HA solution and decided to go with the latest
> Pacemaker. However, I have encountered some very strange problems:
> 
> crm(live)# cib use t1
> crm(t1)# configure
> crm(t1)configure# property stonith-enabled=false
> crm(t1)configure# show
> node $id="0f519a50-98e0-4bc2-8fb9-6bc28d87461c" fiona
> node $id="57db28bf-a58e-4b98-a9a9-03e65bce2431" shrek
> property $id="cib-bootstrap-options" \
> dc-version="1.0.6-f709c638237cdff7556cb6ab615f32826c0f8c06" \
> cluster-infrastructure="Heartbeat" \
> stonith-enabled="false"
> crm(t1)configure# primitive failover-ip ocf:heartbeat:IPaddr params
>  ip=10.3.0.1 op monitor interval=10s
> ERROR: failover-ip: action monitor_0 does not exist
> crm(t1)configure# primitive failover-ip ocf:heartbeat:IPaddr2 params
> ip=10.3.0.1 cidr_netmask=32 op monitor interval=10s
> ERROR: failover-ip: action monitor_0 does not exist
> crm(t1)configure# primitive failover-ip ocf:heartbeat:IPaddr2 params
> ip=10.3.0.1 cidr_netmask=32 interval=10s
> ERROR: failover-ip: parameter interval does not exist
> crm(t1)configure# primitive failover-ip ocf:heartbeat:IPaddr2 params
> ip=10.3.0.1 cidr_netmask=32
> 
> 
> What I can't understand is where the monitor operation has disappeared to.
> 
> I'm currently running a similar configuration on a year-old setup without any
> issues. What am I missing?
> 
> The setup is:
> CentOS 5.4
> Heartbeat:Version : 3.0.1 Release : 1.el5
> Pacemaker:Version : 1.0.6 Release : 1.el5
> 
> I'm not using openais.
> 

-- 
Best regards,
Marian Marinov


___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems

Re: [Linux-HA] Metadata not found?

2009-12-07 Thread Andrew Beekhof
On Sat, Dec 5, 2009 at 11:48 AM, Alexander Födisch  wrote:
> Hi,
>
> we have a very strange problem. On a three-node cluster we can start one of
> the resources on only one node; on the other two nodes the start always
> fails:
>
>
> Dec 05 09:40:14  lrmd: [4097]: WARN: on_msg_get_metadata: empty 
> metadata for ocf::heartbeat::samba-ha.
> Dec 05 09:40:14  crmd: [4100]: ERROR: 
> lrm_get_rsc_type_metadata(575): got a return code HA_FAIL from a
> reply message of rmetadata with function get_ret_from_msg.
> Dec 05 09:40:14  crmd: [4100]: WARN: get_rsc_metadata: No 
> metadata found for samba-ha::ocf:heartbeat
>
>
> crm_verify -LV -x /var/lib/heartbeat/crm/cib.xml
> crm_verify[20967]: 2009/12/05_11:37:10 WARN: unpack_rsc_op: Processing failed 
> op samba-ha_videofs_monitor_2 on
> : unknown error
> crm_verify[20967]: 2009/12/05_11:37:10 WARN: unpack_rsc_op: Processing failed 
> op samba-ha_videofs_monitor_2 on
> : unknown error
>
>
>
> cib.xml snippet for this resource (ids and values elided):
> [...]
>         <primitive class="ocf" provider="heartbeat" type="samba-ha" id="samba-ha_videofs">
>           <operations>
>             <op name="monitor" interval="..." timeout="10s" id="..."/>
>           </operations>
>           <instance_attributes id="...">
>             <attributes>
>               <nvpair name="SAMBAHOST" value="..." id="..."/>
>             </attributes>
>           </instance_attributes>
>         </primitive>
> [...]
>
>
>
> What metadata is heartbeat looking for, and why is it working on one node?

http://www.opencf.org/cgi-bin/viewcvs.cgi/specs/ra/ra-metadata-example.xml?rev=1.4&content-type=text/vnd.viewcvs-markup

Sounds like your agent isn't OCF compliant.  Try running it through ocf-tester.
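
Something like the following exercises the mandatory actions, including
meta-data (the install path of your custom agent is a guess here, and you
will probably need to pass its parameters with -o name=value for the
start/stop tests):

ocf-tester -n samba-ha /usr/lib/ocf/resource.d/heartbeat/samba-ha

You can also request the metadata by hand and check that it is well-formed
XML:

OCF_ROOT=/usr/lib/ocf /usr/lib/ocf/resource.d/heartbeat/samba-ha meta-data | xmllint --noout -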
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] SLES 11 which version of heartbeat

2009-12-07 Thread Werner Flamme
Jochen Lienhard [03.12.2009 09:10]:
> Hi Tim,
> 
> do special reasons exist for using SLE HA?

In SLES 10, all HA software was covered by the normal support contract. In
SLES 11, the HA packages are only supported under a special support contract.

> Are there some problems with linux-ha and SLES 11?

That should not be the case, since SLES 11 is almost the same as openSUSE
11.1, and HA seems to run fine here. But since SAP does not support SLES 11,
I don't have a running system with it.

> How is the support for SLE HA? (same as SLES 11)?

Yes, but it costs extra money :-\ You can extend the standard support
with an HA support package. Ask your account manager at Novell...

Regards,
Werner
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] Primary node is not releasing the resource in the case of failure ( In the NFS-ha)

2009-12-07 Thread Rajkumar Agrawal
Hi Phil Bayfield,

We are not unmounting the NFS share ourselves, because I believe heartbeat
takes care of starting and stopping the NFS service. To test failover, we
stop heartbeat on the primary node, but the secondary node is not able to
take over the resource and we get the error below:

Filesystem[7192]:   2009/12/04_16:22:20 ERROR: Couldn't unmount 
/dvshare; trying cleanup with SIGKILL
Filesystem[7192]:   2009/12/04_16:22:20 INFO: No processes on 
/dvshare were signalled
Filesystem[7192]:   2009/12/04_16:22:21 ERROR: Couldn't unmount 
/dvshare, giving up!
Filesystem[7181]:   2009/12/04_16:22:21 ERROR:  Generic error
ResourceManager[5157]:  2009/12/04_16:22:21 ERROR: Return code 1 from 
/etc/ha.d/resource.d/Filesystem
Filesystem[7353]:   2009/12/04_16:22:21 INFO:  Running OK
ResourceManager[5157]:  2009/12/04_16:22:21 CRIT: Resource STOP failure. 
Reboot required!
ResourceManager[5157]:  2009/12/04_16:22:21 CRIT: Killing heartbeat 
ungracefully!
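
Should the haresources line list the NFS init script after the Filesystem
resource, so that NFS is stopped (and releases /dvshare) before the unmount
is attempted? Since R1-style resources are stopped in reverse order, I
imagine something like the sketch below, where the IP, device, filesystem
type and init script names are only placeholders, not our exact
configuration:

primary-node IPaddr::10.0.0.100 Filesystem::/dev/drbd0::/dvshare::ext3 nfslock nfs

With that ordering, heartbeat would stop nfs and nfslock first on failover
and only then try to unmount /dvshare.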


Phil Bayfield wrote:
> Are you stopping the NFS server before trying to unmount?
> If the resource is busy heartbeat will not be able to unmount it.
>
> Rajkumar Agrawal wrote:
>   
>> Hi,
>> We installed NFS-HA for high availability of the NFS server. For testing,
>> when we stop the heartbeat service on the primary node, the primary node
>> does not release the resource to the secondary, so the secondary node is
>> not able to take over. This is what we get in /var/log/ha-log on the
>> primary node:
>>
>> Filesystem[7192]:   2009/12/04_16:22:20 ERROR: Couldn't unmount 
>> /dvshare; trying cleanup with SIGKILL
>> Filesystem[7192]:   2009/12/04_16:22:20 INFO: No processes on 
>> /dvshare were signalled
>> Filesystem[7192]:   2009/12/04_16:22:21 ERROR: Couldn't unmount 
>> /dvshare, giving up!
>> Filesystem[7181]:   2009/12/04_16:22:21 ERROR:  Generic error
>> ResourceManager[5157]:  2009/12/04_16:22:21 ERROR: Return code 1 from 
>> /etc/ha.d/resource.d/Filesystem
>> Filesystem[7353]:   2009/12/04_16:22:21 INFO:  Running OK
>> ResourceManager[5157]:  2009/12/04_16:22:21 CRIT: Resource STOP failure. 
>> Reboot required!
>> ResourceManager[5157]:  2009/12/04_16:22:21 CRIT: Killing heartbeat 
>> ungracefully!
>>
>>
>> Please help us to troubleshoot this.
>>
>> Thanks
>> Rajkumar Agrawal
>>
>> ___
>> Linux-HA mailing list
>> Linux-HA@lists.linux-ha.org
>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>> See also: http://linux-ha.org/ReportingProblems
>>   
>> 
> ___
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>
>   


-- 
Rajkumar Agrawal. Systems Administrator Deep Value Technology Pvt Ltd
+1 646 651 4686 x122 | +91 44 42630403 x26 | www.deepvalue.net | 90 Anna Salai,
Chennai 600 002, India

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] Xen live migration and constraints - hb_report

2009-12-07 Thread infernix

Hi,

Apologies for my previous mail, I should have been clearer. I've got a 
full hb_report now (attached or at 
http://dx.infernix.net/xen_order.tar.bz2). I'll keep things short and to 
the point.


I've got a two-node heartbeat+pacemaker cluster, using DRBD for shared
storage. I have 7 ocf:heartbeat:Xen resources, all configured similarly to
this:


primitive base ocf:heartbeat:Xen \
meta target-role="started" is-managed="true" allow-migrate="1" \
operations $id="base-operations" \
op monitor interval="10" \
op start interval="0" timeout="45" \
op stop interval="0" timeout="300" \
op migrate_from interval="0" timeout="240" \
op migrate_to interval="0" timeout="240" \
params xmfile="/etc/xen/base.cfg" name="base"

The Xen configs use drbd:resource style disks, and drbd is configured as per
the Xen pages in its documentation. Live migration works fine when migrating
manually (in hb_gui). Without an order constraint, it also works fine when
putting a node in standby, but it migrates them all in parallel.


So I searched the mailing lists and found that this is the kind of order
constraint that should be used to prevent Xen guests from starting in
parallel (only "start X before Y", nothing else):


order db_before_dbreplica 0: db dbreplica symmetrical=false
order dbreplica_before_core-101 0: dbreplica core-101 symmetrical=false
order core-101_before_core-200 0: core-101 core-200 symmetrical=false
order core-200_before_sysadmin 0: core-200 sysadmin symmetrical=false
order sysadmin_before_edge 0: sysadmin edge symmetrical=false
order edge_before_base 0: edge base symmetrical=false

So I added these orders. At the start of the log, all guests are on
xen-a and both nodes are active. Then I put xen-a in standby, and this is
what happens:


Dec  8 03:43:21 xen-b pengine: [32628]: notice: LogActions: Leave resource xen-a-fencing#011(Started xen-b)
notice: LogActions: Stop resource xen-b-fencing#011(xen-a)
notice: check_stack_element: Cannot migrate base due to dependancy on edge (order)
notice: LogActions: Move resource base#011(Started xen-a -> xen-b)
notice: check_stack_element: Cannot migrate core-101 due to dependancy on dbreplica (order)
notice: LogActions: Move resource core-101#011(Started xen-a -> xen-b)
notice: check_stack_element: Cannot migrate core-200 due to dependancy on core-101 (order)
notice: LogActions: Move resource core-200#011(Started xen-a -> xen-b)
info: complex_migrate_reload: Migrating db from xen-a to xen-b
notice: LogActions: Migrate resource db#011(Started xen-a -> xen-b)
notice: check_stack_element: Cannot migrate sysadmin due to dependancy on core-200 (order)
notice: LogActions: Move resource sysadmin#011(Started xen-a -> xen-b)
notice: check_stack_element: Cannot migrate edge due to dependancy on sysadmin (order)
notice: LogActions: Move resource edge#011(Started xen-a -> xen-b)
notice: check_stack_element: Cannot migrate dbreplica due to dependancy on db (order)
notice: LogActions: Move resource dbreplica#011(Started xen-a -> xen-b)
notice: LogActions: Move resource Email_Alerting#011(Started xen-a -> xen-b)


With these order constraints active, the only Xen guest that gets
live-migrated is db. All the others are stopped and then started.


Two questions:

- How can I make them migrate one after another instead of stopping and 
starting?


- And how can I then still keep them from starting in parallel on a clean
boot (e.g. both nodes freshly booted after a power outage)?
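
(The closest knob I have found so far for the clean-boot case is the
batch-limit cluster property, which caps how many actions the cluster runs
in parallel, but it throttles everything rather than just the Xen starts,
so I'm not sure it is the right tool. A sketch of what I mean, with an
arbitrary value:

crm configure property batch-limit=2

Is there something more targeted?)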


Any help would be appreciated.

Thanks!



xen_order.tar.bz2
Description: application/bzip
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems