2011/1/27 nozawat <noza...@gmail.com>:
> Hi
>
> I was able to complete CTS. A couple of rough edges:
> 1) Python 2.5 or later is required.
> However, RHEL5.5 only has Python 2.4, so I ran CTS with Python 2.6.5 on
> RHEL6.0.
Ah!

> 2) The following environment variables are necessary:
> * cluster_log=/share/ha/logs/ha-log-local7
> * cluster_hosts="cts0101 cts0102"

CTS doesn't do anything with environment variables - unless you're using
my cts-run() function from the release testing page. It is also
sufficient to set them on the command line (as you did) with:

  --nodes "cts0201 cts0202"

and

  --logfile /share/ha/logs/ha-log-local7

> 3) I had to add stonith-enabled to cib-bootstrap-options in order for my
> cib.xml to be accepted. The following errors occur without it:
> -----
> Jan 27 12:02:39 BadNews: Jan 27 12:01:17 cts0201 pengine: [14630]: ERROR:
> unpack_resources: Resource start-up disabled since no STONITH resources have
> been defined
> Jan 27 12:02:39 BadNews: Jan 27 12:01:17 cts0201 pengine: [14630]: ERROR:
> unpack_resources: Either configure some or disable STONITH with the
> stonith-enabled option
> Jan 27 12:02:39 BadNews: Jan 27 12:01:17 cts0201 pengine: [14630]: ERROR:
> unpack_resources: NOTE: Clusters with shared data need STONITH to ensure
> data integrity
> ----

That looks pretty normal, we don't test with --stonith no

> The log from my CTS run is as follows.
> ---
> [buildbot@bbs02 /usr/share/pacemaker/tests/cts]$ python CTSlab.py --nodes
> "cts0201 cts0202" --at-boot 1 --stack corosync --stonith no --logfile
> /share/ha/logs/ha-log-local7 --syslog-facility local7 --cib-filename
> /share/ha/cib.xml 10
> Jan 27 15:48:41 Random seed is: 1296110921
> Jan 27 15:48:41 >>>>>>>>>>>>>>>> BEGINNING 10 TESTS
> Jan 27 15:48:41 Stack: corosync (flatiron)
> Jan 27 15:48:41 Schema: pacemaker-1.0
> Jan 27 15:48:41 Scenario: Random Test Execution
> Jan 27 15:48:41 Random Seed: 1296110921
> Jan 27 15:48:41 System log files: /share/ha/logs/ha-log-local7
> Jan 27 15:48:41 Cluster nodes:
> Jan 27 15:48:41 * cts0201
> Jan 27 15:48:41 * cts0202
> Jan 27 15:48:53 Testing for syslog logs
> Jan 27 15:48:53 Testing for remote logs
> Jan 27 15:49:31 Continuing with remote-based log reader
> Jan 27 15:49:42 Stopping Cluster Manager on all nodes
> Jan 27 15:49:42 Starting Cluster Manager on all nodes.
> Jan 27 15:49:42 Starting crm-flatiron on node cts0201
> Jan 27 15:51:07 Starting crm-flatiron on node cts0202
> Jan 27 15:52:40 Running test SimulStop (cts0202) [ 1]
> Jan 27 15:53:39 Running test NearQuorumPoint (cts0202) [ 2]
> Jan 27 15:55:34 Running test ComponentFail (cts0201) [ 3]
> Jan 27 15:57:40 Running test Reattach (cts0202) [ 4]
> Jan 27 16:02:35 Running test SimulStop (cts0201) [ 5]
> Jan 27 16:03:26 Running test SpecialTest1 (cts0201) [ 6]
> Jan 27 16:06:19 Running test ComponentFail (cts0201) [ 7]
> Jan 27 16:07:17 Running test SpecialTest1 (cts0201) [ 8]
> Jan 27 16:10:47 Running test ComponentFail (cts0201) [ 9]
> Jan 27 16:11:42 BadNews: Jan 27 16:11:03 cts0201 crmd: [23399]: ERROR:
> stonithd_op_result_ready: not signed on

Was there a bug here?
Otherwise it looks good, glad you were able to get it going in the end.
> Jan 27 16:11:45 Running test ResourceRecover (cts0202) [ 10]
> Jan 27 16:11:46 No active resources on cts0202
> Jan 27 16:12:03 Stopping Cluster Manager on all nodes
> Jan 27 16:12:03 Stopping crm-flatiron on node cts0201
> Jan 27 16:12:26 Stopping crm-flatiron on node cts0202
> Jan 27 16:13:09 ****************
> Jan 27 16:13:09 Overall Results:{'failure': 0, 'skipped': 0, 'success': 10, 'BadNews': 1}
> Jan 27 16:13:09 ****************
> Jan 27 16:13:09 Test Summary
> Jan 27 16:13:09 Test Flip:            {'auditfail': 0, 'failure': 0, 'skipped': 0, 'calls': 0}
> Jan 27 16:13:09 Test Restart:         {'auditfail': 0, 'failure': 0, 'skipped': 0, 'calls': 0}
> Jan 27 16:13:09 Test StartOnebyOne:   {'auditfail': 0, 'failure': 0, 'skipped': 0, 'calls': 0}
> Jan 27 16:13:09 Test SimulStart:      {'auditfail': 0, 'failure': 0, 'skipped': 0, 'calls': 0}
> Jan 27 16:13:09 Test SimulStop:       {'auditfail': 0, 'failure': 0, 'skipped': 0, 'calls': 2}
> Jan 27 16:13:09 Test StopOnebyOne:    {'auditfail': 0, 'failure': 0, 'skipped': 0, 'calls': 0}
> Jan 27 16:13:09 Test RestartOnebyOne: {'auditfail': 0, 'failure': 0, 'skipped': 0, 'calls': 0}
> Jan 27 16:13:09 Test PartialStart:    {'auditfail': 0, 'failure': 0, 'skipped': 0, 'calls': 0}
> Jan 27 16:13:09 Test Standby:         {'auditfail': 0, 'failure': 0, 'skipped': 0, 'calls': 0}
> Jan 27 16:13:09 Test ResourceRecover: {'auditfail': 0, 'failure': 0, 'skipped': 1, 'calls': 1}
> Jan 27 16:13:09 Test ComponentFail:   {'auditfail': 0, 'failure': 0, 'skipped': 0, 'calls': 3}
> Jan 27 16:13:09 Test Reattach:        {'auditfail': 0, 'failure': 0, 'skipped': 0, 'calls': 1}
> Jan 27 16:13:09 Test SpecialTest1:    {'auditfail': 0, 'failure': 0, 'skipped': 0, 'calls': 2}
> Jan 27 16:13:09 Test NearQuorumPoint: {'auditfail': 0, 'failure': 0, 'skipped': 0, 'calls': 1}
> Jan 27 16:13:09 <<<<<<<<<<<<<<<< TESTS COMPLETED
> -----
>
> The URL I referred to is as follows.
> http://www.clusterlabs.org/wiki/Release_Testing
>
> Regards,
> Tomo
>
> 26 January 2011 20:01, nozawat <noza...@gmail.com>:
>>
>> Hi Andrew
>>
>> Where is the filename for cts_log_watcher.py set?
>> cts_log_watcher.py is created in /tmp, but the filename inside it does
>> not seem to be changed from /var/log/messages.
>> Or perhaps the filename is not being passed in by CTSlab.py.
>>
>> Regards,
>> Tomo
>>
>> 22 January 2011 11:15, nozawat <noza...@gmail.com>:
>>>
>>> Hi
>>>
>>> Thank you for your reply.
>>> I stopped the script with Ctrl+C after "Audit LogAudit FAILED" was output.
>>> * bbs01-console.log -> central server console log
>>> * ha-log-local7-bbs01 -> central server
>>> * ha-log-local7-cts0101 -> cts server 1
>>> * ha-log-local7-cts0102 -> cts server 2
>>>
>>> The real file name of the server log is ha-log-local7.
>>> I renamed the files in order to send them by email.
>>> They are created under /share/ha/logs on all servers.
>>>
>>> BTW, a file is created in /tmp.
>>> Are the permissions below a problem?
>>> ---
>>> [11:14:23][root@bbs01 ~]$ ll /tmp
>>> -rw-r--r-- 1 root root 1612 Jan 22 10:44 cts_log_watcher.py
>>> [11:12:39][root@cts0101 ~]$ ll /tmp
>>> -rw-r--r-- 1 root root 1612 Jan 22 10:44 cts_log_watcher.py
>>> [11:13:36][root@cts0102 ~]$ ll /tmp
>>> -rw-r--r-- 1 root root 1612 Jan 22 10:44 cts_log_watcher.py
>>> ---
>>>
>>> Regards,
>>> Tomo
>>>
>>> 2011/1/22 Andrew Beekhof <and...@beekhof.net>
>>>>
>>>> On Fri, Jan 21, 2011 at 4:38 PM, nozawat <noza...@gmail.com> wrote:
>>>> > Hi
>>>> >
>>>> > Thank you for your reply.
>>>> > I am logging on the central server and on both servers running CTS.
>>>> > The test message is output on both the central server and the
>>>> > CTS servers.
>>>>
>>>> Can we see it please?
>>>>
>>>> >
>>>> > Regards,
>>>> > Tomo
>>>> >
>>>> > 2011/1/21 Andrew Beekhof <and...@beekhof.net>
>>>> >>
>>>> >> On Fri, Jan 21, 2011 at 6:03 AM, nozawat <noza...@gmail.com> wrote:
>>>> >> > Hi
>>>> >> >
>>>> >> > I ran CTS in the following environment.
>>>> >> > * OS: RHEL5.5-x86_64
>>>> >> > * pacemaker-1.0.9.1-1.15.el5
>>>> >> > * TDN(bbs01)
>>>> >> > * TNNs(cts0101 cts0102)
>>>> >> >
>>>> >> > It is probably a phenomenon like the following:
>>>> >> > http://www.gossamer-threads.com/lists/linuxha/pacemaker/69322
>>>> >> >
>>>> >> > SSH login without password -> OK.
>>>> >> > Syslog message transfer by syslog-ng -> OK.
>>>> >>
>>>> >> You're logging to a central server? The same server you're running
>>>> >> CTS on?
>>>> >> If so, what is the contents of /share/ha/logs/ha-log-local7 on that
>>>> >> machine? Because that is where CTS is looking.
>>>> >>
>>>> >> >
>>>> >> > -------
>>>> >> > $ python /usr/share/pacemaker/tests/cts/CTSlab.py --nodes "cts0101 cts0102"
>>>> >> > --at-boot 1 --stack heartbeat --stonith no --logfile
>>>> >> > /share/ha/logs/ha-log-local7 --syslog-facility local7 1
>>>> >> > Jan 21 13:23:08 Random seed is: 1295583788
>>>> >> > Jan 21 13:23:08 >>>>>>>>>>>>>>>> BEGINNING 1 TESTS
>>>> >> > Jan 21 13:23:08 Stack: heartbeat
>>>> >> > Jan 21 13:23:08 Schema: pacemaker-1.0
>>>> >> > Jan 21 13:23:08 Scenario: Random Test Execution
>>>> >> > Jan 21 13:23:08 Random Seed: 1295583788
>>>> >> > Jan 21 13:23:08 System log files: /share/ha/logs/ha-log-local7
>>>> >> > Jan 21 13:23:08 Cluster nodes:
>>>> >> > Jan 21 13:23:08 * cts0101
>>>> >> > Jan 21 13:23:08 * cts0102
>>>> >> > Jan 21 13:23:12 Testing for syslog logs
>>>> >> > Jan 21 13:23:12 Testing for remote logs
>>>> >> > Jan 21 13:24:16 Restarting logging on: ['cts0101', 'cts0102']
>>>> >> > Jan 21 13:25:49 Restarting logging on: ['cts0101', 'cts0102']
>>>> >> > Jan 21 13:28:21 Restarting logging on: ['cts0101', 'cts0102']
>>>> >> > Jan 21 13:31:54 Restarting logging on: ['cts0101', 'cts0102']
>>>> >> > Jan 21 13:35:54 ERROR: Cluster logging unrecoverable.
>>>> >> > Jan 21 13:35:54 Audit LogAudit FAILED.
>>>> >> > -----
>>>> >> >
>>>> >> > I ran it with heartbeat, but a similar error occurs with corosync.
>>>> >> > I see "Single search timed out" errors in the log, and it appears
>>>> >> > to retry:
>>>> >> > -----
>>>> >> > Jan 21 13:23:11 bbs01 CTS: debug: Audit DiskspaceAudit passed.
>>>> >> > Jan 21 13:23:12 bbs01 CTS: Testing for syslog logs
>>>> >> > Jan 21 13:23:12 bbs01 CTS: Testing for remote logs
>>>> >> > Jan 21 13:23:12 bbs01 CTS: debug: lw: cts0101:/share/ha/logs/ha-log-local7:
>>>> >> > Installing /tmp/cts_log_watcher.py on cts0101
>>>> >> > Jan 21 13:23:12 bbs01 CTS: debug: lw: cts0102:/share/ha/logs/ha-log-local7:
>>>> >> > Installing /tmp/cts_log_watcher.py on cts0102
>>>> >> > Jan 21 13:23:13 cts0102 logger: Test message from cts0102
>>>> >> > Jan 21 13:23:13 cts0101 logger: Test message from cts0101
>>>> >> > Jan 21 13:23:44 bbs01 CTS: debug: lw: LogAudit: Single search timed out:
>>>> >> > timeout=30, start=1295583793, limit=1295583824, now=1295583824
>>>> >> > Jan 21 13:24:16 bbs01 CTS: debug: lw: LogAudit: Single search timed out:
>>>> >> > timeout=30, start=1295583824, limit=1295583855, now=1295583856
>>>> >> > Jan 21 13:24:16 bbs01 CTS: Restarting logging on: ['cts0101', 'cts0102']
>>>> >> > Jan 21 13:24:16 bbs01 CTS: debug: cmd: async: target=cts0101, rc=22203:
>>>> >> > /etc/init.d/syslog-ng restart 2>&1 > /dev/null
>>>> >> > Jan 21 13:24:16 bbs01 CTS: debug: cmd: async: target=cts0102, rc=22204:
>>>> >> > /etc/init.d/syslog-ng restart 2>&1 > /dev/null
>>>> >> > Jan 21 13:25:17 cts0102 logger: Test message from cts0102
>>>> >> > Jan 21 13:25:17 cts0101 logger: Test message from cts0101
>>>> >> > -----
>>>> >> >
>>>> >> > The test cases do seem to run after this error.
>>>> >> > However, the script then finishes with an error, because
>>>> >> > "Audit LogAudit FAILED" occurred.
>>>> >> > Is this the expected CTS result?
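[For context: the LogAudit failure above comes from CTS writing a test message on each node (via logger) and then polling the configured log file for it; each "Single search timed out" is one 30-second poll that never saw the message. A rough, hypothetical Python sketch of that poll-with-timeout pattern - this is illustrative, not the actual cts_log_watcher.py code, and wait_for_message is an invented name:]

```python
import time

def wait_for_message(path, pattern, timeout=30, interval=1):
    """Poll a log file until a line containing `pattern` appears,
    or until `timeout` seconds have elapsed. Returns the matching
    line, or None on timeout - roughly the situation LogAudit
    reports as 'Single search timed out'."""
    limit = time.time() + timeout
    offset = 0
    while time.time() < limit:
        try:
            with open(path) as f:
                f.seek(offset)       # only scan lines added since last poll
                for line in f:
                    if pattern in line:
                        return line
                offset = f.tell()
        except FileNotFoundError:
            pass                     # log file may not exist yet
        time.sleep(interval)
    return None
```

[If the central syslog-ng server never writes the "Test message from ..." lines into /share/ha/logs/ha-log-local7 within the timeout, a watcher like this returns None every time - which matches the repeated timeouts and the eventual "Cluster logging unrecoverable" in the log above.]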
>>>> >> > >>>> >> > Regards, >>>> >> > Tomo >>>> >> > >>>> >> > >>>> >> > _______________________________________________ >>>> >> > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org >>>> >> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker >>>> >> > >>>> >> > Project Home: http://www.clusterlabs.org >>>> >> > Getting started: >>>> >> > http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf >>>> >> > Bugs: >>>> >> > >>>> >> > >>>> >> > http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker >>>> >> > >>>> >> > >>>> >> >>>> >> _______________________________________________ >>>> >> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org >>>> >> http://oss.clusterlabs.org/mailman/listinfo/pacemaker >>>> >> >>>> >> Project Home: http://www.clusterlabs.org >>>> >> Getting started: >>>> >> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf >>>> >> Bugs: >>>> >> >>>> >> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker >>>> > >>>> > >>>> > _______________________________________________ >>>> > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org >>>> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker >>>> > >>>> > Project Home: http://www.clusterlabs.org >>>> > Getting started: >>>> > http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf >>>> > Bugs: >>>> > >>>> > http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker >>>> > >>>> > >>>> >>>> _______________________________________________ >>>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org >>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker >>>> >>>> Project Home: http://www.clusterlabs.org >>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf >>>> Bugs: >>>> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker >>> >> > > > _______________________________________________ > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org > http://oss.clusterlabs.org/mailman/listinfo/pacemaker > 

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker