On 07/08/2013, at 5:42 PM, Thomas Glanzmann <[email protected]> wrote:

> Hello Andrew,
> 
>> I can try and fix that if you re-run with -x and paste the output.
> 
> (apache-03) [~] crm_report -l /var/adm/syslog/2013/08/05 -f "2013-08-04 
> 18:30:00" -t "2013-08-04 19:15" -x
> + shift
> + true
> + [ ! -z ]
> + break
> + [ x != x ]
> + [ x1375633800 != x ]
> + masterlog=
> + [ -z  ]
> + log WARNING: The tarball produced by this program may contain
> + printf %-10s  WARNING: The tarball produced by this program may contain\n 
> apache-03:
> apache-03:  WARNING: The tarball produced by this program may contain
> + log          sensitive information such as passwords.
> + printf %-10s           sensitive information such as passwords.\n apache-03:
> apache-03:           sensitive information such as passwords.
> + log
> + printf %-10s  \n apache-03:
> apache-03:
> + log We will attempt to remove such information if you use the
> + printf %-10s  We will attempt to remove such information if you use the\n 
> apache-03:
> apache-03:  We will attempt to remove such information if you use the
> + log -p option. For example: -p "pass.*" -p "user.*"
> + printf %-10s  -p option. For example: -p "pass.*" -p "user.*"\n apache-03:
> apache-03:  -p option. For example: -p "pass.*" -p "user.*"
> + log
> + printf %-10s  \n apache-03:
> apache-03:
> + log However, doing this may reduce the ability for the recipients
> + printf %-10s  However, doing this may reduce the ability for the 
> recipients\n apache-03:
> apache-03:  However, doing this may reduce the ability for the recipients
> + log to diagnose issues and generally provide assistance.
> + printf %-10s  to diagnose issues and generally provide assistance.\n 
> apache-03:
> apache-03:  to diagnose issues and generally provide assistance.
> + log
> + printf %-10s  \n apache-03:
> apache-03:
> + log IT IS YOUR RESPONSIBILITY TO PROTECT SENSITIVE DATA FROM EXPOSURE
> + printf %-10s  IT IS YOUR RESPONSIBILITY TO PROTECT SENSITIVE DATA FROM 
> EXPOSURE\n apache-03:
> apache-03:  IT IS YOUR RESPONSIBILITY TO PROTECT SENSITIVE DATA FROM EXPOSURE
> + log
> + printf %-10s  \n apache-03:
> apache-03:
> + [ -z  ]
> + getnodes any
> + [ -z any ]
> + cluster=any
> + [ -z ]
> + HA_STATE_DIR=/var/lib/heartbeat
> + find_cluster_cf any
> + warning Unknown cluster type: any
> + log WARN: Unknown cluster type: any
> + printf %-10s  WARN: Unknown cluster type: any\n apache-03:
> apache-03:  WARN: Unknown cluster type: any
> + cluster_cf=
> + ps -ef
> + egrep -qs [c]ib
> + debug Querying CIB for nodes
> + [ 0 -gt 0 ]
> + cibadmin -Ql -o nodes
> + awk
>          /type="normal"/ {
>                for( i=1; i<=NF; i++ )
>                        if( $i~/^uname=/ ) {
>                                sub("uname=.","",$i);
>                                sub("\".*","",$i);
>                                print $i;
>                                next;
>                        }
>          }
> 
> + tr \n
> + nodes=apache-03 apache-04
> + log Calculated node list: apache-03 apache-04
> + printf %-10s  Calculated node list: apache-03 apache-04 \n apache-03:
> apache-03:  Calculated node list: apache-03 apache-04
> + [ -z apache-03 apache-04  ]
> + echo apache-03 apache-04
> + grep -qs apache-03
> + debug We are a cluster node
> + [ 0 -gt 0 ]
> + [ -z 1375636500 ]
> + date +%a-%d-%b-%Y
> + label=pcmk-Wed-07-Aug-2013
> + time2str 1375633800
> + perl -e use POSIX; print strftime('%x %X',localtime(1375633800));
> + time2str 1375636500
> + perl -e use POSIX; print strftime('%x %X',localtime(1375636500));
> + log Collecting data from apache-03 apache-04  (08/04/13 18:30:00 to 
> 08/04/13 19:15:00)
> + printf %-10s  Collecting data from apache-03 apache-04  (08/04/13 18:30:00 
> to 08/04/13 19:15:00)\n apache-03:
> apache-03:  Collecting data from apache-03 apache-04  (08/04/13 18:30:00 to 
> 08/04/13 19:15:00)
> + collect_data pcmk-Wed-07-Aug-2013 1375633800 1375636500
> + label=pcmk-Wed-07-Aug-2013
> + expr 1375633800 - 10
> + start=1375633790
> + expr 1375636500 + 10
> + end=1375636510
> + masterlog=
> + [ x != x ]
> + l_base=/home/tg/pcmk-Wed-07-Aug-2013
> + r_base=pcmk-Wed-07-Aug-2013
> + [ -e /home/tg/pcmk-Wed-07-Aug-2013 ]
> + mkdir -p /home/tg/pcmk-Wed-07-Aug-2013
> + [ x != x ]
> + cat
> + [ apache-03 = apache-03 ]
> + cat
> + cat /home/tg/pcmk-Wed-07-Aug-2013/.env /usr/share/pacemaker/report.common 
> /usr/share/pacemaker/report.collector
> + bash /home/tg/pcmk-Wed-07-Aug-2013/collector
> apache-03:  ERROR: Could not determine the location of your cluster logs, try 
> specifying --logfile /some/path
> + cat
> + [ apache-03 = apache-04 ]
> + cat /home/tg/pcmk-Wed-07-Aug-2013/.env /usr/share/pacemaker/report.common 
> /usr/share/pacemaker/report.collector
> + ssh+  -l root -T apache-04 -- mkdir -p pcmk-Wed-07-Aug-2013; cat > 
> pcmk-Wed-07-Aug-2013/collector; bash pcmk-Wed-07-Aug-2013/collectorcd
> /home/tg/pcmk-Wed-07-Aug-2013
> + tar xf -
> apache-04:  ERROR: Could not determine the location of your cluster logs, try 
> specifying --logfile /some/path
> tar: This does not look like a tar archive
> tar: Exiting with failure status due to previous errors
> + analyze /home/tg/pcmk-Wed-07-Aug-2013
> + flist=hostcache members.txt cib.xml crm_mon.txt  logd.cf sysinfo.txt
> + printf Diff hostcache...
> + ls /home/tg/pcmk-Wed-07-Aug-2013/*/hostcache
> + echo no /home/tg/pcmk-Wed-07-Aug-2013/*/hostcache :/
> + continue
> + printf Diff members.txt...
> + ls /home/tg/pcmk-Wed-07-Aug-2013/*/members.txt
> + echo no /home/tg/pcmk-Wed-07-Aug-2013/*/members.txt :/
> + continue
> + printf Diff cib.xml...
> + ls /home/tg/pcmk-Wed-07-Aug-2013/*/cib.xml
> + echo no /home/tg/pcmk-Wed-07-Aug-2013/*/cib.xml :/
> + continue
> + printf Diff crm_mon.txt...
> + ls /home/tg/pcmk-Wed-07-Aug-2013/*/crm_mon.txt
> + echo no /home/tg/pcmk-Wed-07-Aug-2013/*/crm_mon.txt :/
> + continue
> + printf Diff logd.cf...
> + ls /home/tg/pcmk-Wed-07-Aug-2013/*/logd.cf
> + echo no /home/tg/pcmk-Wed-07-Aug-2013/*/logd.cf :/
> + continue
> + printf Diff sysinfo.txt...
> + ls /home/tg/pcmk-Wed-07-Aug-2013/*/sysinfo.txt
> + echo no /home/tg/pcmk-Wed-07-Aug-2013/*/sysinfo.txt :/
> + continue
> + [ -f /home/tg/pcmk-Wed-07-Aug-2013/cluster-log.txt ]
> + cat /home/tg/pcmk-Wed-07-Aug-2013/apache-03/analysis.txt
> cat: /home/tg/pcmk-Wed-07-Aug-2013/apache-03/analysis.txt: No such file or 
> directory
> + [ -s /home/tg/pcmk-Wed-07-Aug-2013/apache-03/events.txt ]
> + [ -s /home/tg/pcmk-Wed-07-Aug-2013/cluster-log.txt ]
> + cat /home/tg/pcmk-Wed-07-Aug-2013/apache-04/analysis.txt
> cat: /home/tg/pcmk-Wed-07-Aug-2013/apache-04/analysis.txt: No such file or 
> directory
> + [ -s /home/tg/pcmk-Wed-07-Aug-2013/apache-04/events.txt ]
> + [ -s /home/tg/pcmk-Wed-07-Aug-2013/cluster-log.txt ]
> + log
> + printf %-10s   \n apache-03:
> apache-03:
> + [ 1 = 1 ]
> + shrink /home/tg/pcmk-Wed-07-Aug-2013
> + olddir=/home/tg
> + dirname /home/tg/pcmk-Wed-07-Aug-2013
> + dir=/home/tg
> + basename /home/tg/pcmk-Wed-07-Aug-2013
> + base=pcmk-Wed-07-Aug-2013
> + target=/home/tg/pcmk-Wed-07-Aug-2013.tar
> + tar_options=cf
> + pickfirst bzip2 gzip false
> + which bzip2
> + echo bzip2
> + return 0
> + variant=bzip2
> + tar_options=jcf
> + target=/home/tg/pcmk-Wed-07-Aug-2013.tar.bz2
> + [ -e /home/tg/pcmk-Wed-07-Aug-2013.tar.bz2 ]
> + cd /home/tg
> + tar jcf /home/tg/pcmk-Wed-07-Aug-2013.tar.bz2 pcmk-Wed-07-Aug-2013
> + cd /home/tg
> + echo /home/tg/pcmk-Wed-07-Aug-2013.tar.bz2
> + fname=/home/tg/pcmk-Wed-07-Aug-2013.tar.bz2
> + rm -rf /home/tg/pcmk-Wed-07-Aug-2013
> + log Collected results are available in /home/tg/pcmk-Wed-07-Aug-2013.tar.bz2
> + printf %-10s  Collected results are available in 
> /home/tg/pcmk-Wed-07-Aug-2013.tar.bz2\n apache-03:
> apache-03:  Collected results are available in 
> /home/tg/pcmk-Wed-07-Aug-2013.tar.bz2
> + log
> + printf %-10s   \n apache-03:
> apache-03:
> + log Please create a bug entry at
> + printf %-10s  Please create a bug entry at\n apache-03:
> apache-03:  Please create a bug entry at
> + log     
> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
> + printf %-10s      
> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker\n 
> apache-03:
> apache-03:      
> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
> + log Include a description of your problem and attach this tarball
> + printf %-10s  Include a description of your problem and attach this 
> tarball\n apache-03:
> apache-03:  Include a description of your problem and attach this tarball
> + log
> + printf %-10s   \n apache-03:
> apache-03:
> + log Thank you for taking time to create this report.
> + printf %-10s  Thank you for taking time to create this report.\n apache-03:
> apache-03:  Thank you for taking time to create this report.
> + log
> + printf %-10s   \n apache-03:

It really helps to read the output of the commands you're running:

Did you not see these messages the first time?

apache-03:  WARN: Unknown cluster type: any
apache-03:  ERROR: Could not determine the location of your cluster logs, try 
specifying --logfile /some/path
apache-04:  ERROR: Could not determine the location of your cluster logs, try 
specifying --logfile /some/path

Try adding -H and --logfile {somevalue} next time.
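
For reference, that suggestion could look roughly like the following. This is only a sketch under assumptions: it takes -H to be the crm_report switch that selects heartbeat as the cluster type, and it reuses the log path and time window from the original command. Substitute whichever file your syslog actually writes the cluster messages to.

```shell
#!/bin/sh
# Sketch only. Assumptions: -H selects heartbeat as the cluster type, and
# LOGFILE is the file that receives the cluster messages on each node.
LOGFILE="/var/adm/syslog/2013/08/05"
FROM="2013-08-04 18:30:00"
TO="2013-08-04 19:15"

# Print the command line that would be used.
report_cmd() {
    printf 'crm_report -H --logfile %s -f "%s" -t "%s"\n' "$LOGFILE" "$FROM" "$TO"
}

report_cmd

# Only run it where crm_report is actually installed.
if command -v crm_report >/dev/null 2>&1; then
    crm_report -H --logfile "$LOGFILE" -f "$FROM" -t "$TO"
fi
```

If the ERROR lines about the log location still appear, the collector did not find cluster messages in the file it was given.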


> 
> Resulting file is here:
> https://thomas.glanzmann.de/tmp/pcmk-Wed-07-Aug-2013.tar.bz2
> 
>> I can't do anything with the core file I'm afraid.  I don't run debian
>> at all, let alone that particular version with the same binaries,
>> libraries and symbols as you.  Without those, the core file is
>> meaningless (which is why crm_report generates backtraces).
> 
> I see. I also think that Debian does not package the debug symbols, so
> the core files really are useless. Please point me to the right
> packages if I'm wrong.

I have no experience with Debian.

> 
>> That shouldn't have resulted in a crash.
> 
> It does. I also tried to reproduce it on a 32-bit system. There, both
> nodes at least rebooted at the same time, but the config was not lost,
> and this time crm just reported an error and did not core dump.
> 
>> I would _really_ recommend upgrading to something a little more
>> recent.  And it might be time to get off heartbeat while you're at it.
> 
> Just to be absolutely sure: I should upgrade to the most recent pacemaker
> release and use corosync as the communication layer?

An updated pacemaker is the important part.  
Whether you switch to corosync too is up to you.

Pacemaker+heartbeat is by far the least tested combination.

> 
> I tried corosync a few years back and I was annoyed because back then it
> could not handle more than two heartbeat links between the nodes.
> However, I saw that it now can, and at the moment I don't need more anyway.
> 
> Does anyone have Debian packages that can be used in production, or
> should I package it myself?

Best to poke the Debian maintainers.

> 
> Does someone have a howto guide on using the peer outdater with corosync?

I'm sure LINBIT has one somewhere.

> 
> One last question about maintenance mode: I want to use maintenance mode to
> change the configuration without affecting production. See that the
> monitors take the system out of maintenance mode and then try the
> failover.  I already have verified the resource agents work correctly. Is
> that a valid use of maintenance mode, or should I always test my setup
> on a lab system and only then put it into the production system?

Do you mean "See that the monitors _work, then_ take the system out of
maintenance mode..."?
If so, then yes.
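
As a rough illustration of that sequence, here is a minimal sketch. It assumes the crm shell (crmsh) is installed and that the cluster-wide maintenance-mode property is what is being toggled. The run helper and the DRY_RUN variable are hypothetical, added only so the steps can be printed without touching a live cluster.

```shell
#!/bin/sh
# Sketch only. Assumes the crm shell (crmsh) is installed; with DRY_RUN=1
# (the default here) the commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

maintenance_cycle() {
    # 1. Stop the cluster from managing resources.
    run crm configure property maintenance-mode=true
    # 2. Make the configuration change here (placeholder step).
    # 3. One-shot status check before handing control back.
    run crm_mon -1
    # 4. Let the cluster manage resources again, then test the failover.
    run crm configure property maintenance-mode=false
}

maintenance_cycle
```

Set DRY_RUN=0 to execute the commands for real; the configuration change itself is left as a placeholder because it depends on the setup.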

> 
> Cheers,
>        Thomas
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems

