How about putting this tool in the dev-support directory? Thanks.
On Aug 30, 2014, at 11:10 PM, Yongjun Zhang <yzh...@cloudera.com> wrote:

> Hi,
>
> I developed a tool to detect flaky tests in Hadoop jenkins test jobs, on
> top of the initial work Todd Lipcon did. We find it quite useful, and with
> Todd's agreement, I'd like to push it upstream so all of us can share it
> (thanks Todd for the initial work and support). I hope you find the tool
> useful.
>
> This is a tool for Hadoop contributors rather than Hadoop users, and it can
> certainly be adapted to projects other than Hadoop. I wonder where a good
> place to put it would be. Your advice is very much appreciated.
>
> Please see below the description and example output of the tool.
>
> Thanks a lot.
>
> --Yongjun
>
> Description of the tool:
>
> #
> # Given a jenkins test job, this script examines all runs of the job done
> # within a specified period of time (a number of days prior to the execution
> # time of this script), and reports all failed tests.
> #
> # The output of this script includes a section for each run that has failed
> # tests, with each failed test name listed.
> #
> # More importantly, at the end it outputs a summary section listing all
> # failed tests across all examined runs, indicating how many runs each test
> # failed in, sorted by that count.
> #
> # This way, when we see failed tests in a PreCommit build, we can quickly
> # tell whether a failed test is a new failure or has failed before, in which
> # case it may just be a flaky test.
> #
> # Of course, to be 100% sure about the reason for a failed test, a closer
> # look at the failed test for the specific run is necessary.
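[Editor's note: the core of the approach the description above outlines can be sketched as below. This is an illustrative sketch, not the actual `determine-flaky-tests-hadoop.py`; the function names (`failed_tests`, `summarize`) and the `JENKINS_URL` constant are made up here, though the `testReport/api/json` endpoint and its `suites`/`cases` fields are the standard Jenkins JSON API.]

```python
import json
import urllib.request
from collections import Counter

JENKINS_URL = "https://builds.apache.org"  # illustrative constant

def failed_tests(job, build_number):
    """Fetch one build's testReport via the Jenkins JSON API and return
    the names of its failed tests. (The real script also handles builds
    whose testReport cannot be opened.)"""
    url = "%s/job/%s/%d/testReport/api/json" % (JENKINS_URL, job, build_number)
    with urllib.request.urlopen(url) as resp:
        report = json.load(resp)
    return [case["className"] + "." + case["name"]
            for suite in report.get("suites", [])
            for case in suite.get("cases", [])
            if case.get("status") in ("FAILED", "REGRESSION")]

def summarize(per_build_failures):
    """Produce the <#occurrences: testName> summary: how many builds each
    test failed in, most frequent first."""
    counts = Counter()
    for names in per_build_failures:
        counts.update(set(names))  # count each test at most once per build
    return counts.most_common()
```

For instance, `summarize` applied to the five Hadoop-Common-0.23-Build runs shown below would return a single pair for `testSnappyCodec` with a count of 5, flagging it as a likely flaky (or persistently broken) test.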
> #
>
> Example usage and output of the tool for job Hadoop-Common-0.23-Build,
> which indicates that the same test failed five times in a row:
>
> ./determine-flaky-tests-hadoop.py -j Hadoop-Common-0.23-Build
> ****Recently FAILED builds in url:
> https://builds.apache.org//job/Hadoop-Common-0.23-Build
>     THERE ARE 5 builds (out of 5) that have failed tests in the past 14
>     days, as listed below:
>
> ==>https://builds.apache.org/job/Hadoop-Common-0.23-Build/1057/testReport (2014-08-30 02:01:30)
>     Failed test: org.apache.hadoop.io.compress.TestCodec.testSnappyCodec
> ==>https://builds.apache.org/job/Hadoop-Common-0.23-Build/1056/testReport (2014-08-29 02:01:30)
>     Failed test: org.apache.hadoop.io.compress.TestCodec.testSnappyCodec
> ==>https://builds.apache.org/job/Hadoop-Common-0.23-Build/1055/testReport (2014-08-28 02:01:30)
>     Failed test: org.apache.hadoop.io.compress.TestCodec.testSnappyCodec
> ==>https://builds.apache.org/job/Hadoop-Common-0.23-Build/1054/testReport (2014-08-27 02:01:29)
>     Failed test: org.apache.hadoop.io.compress.TestCodec.testSnappyCodec
> ==>https://builds.apache.org/job/Hadoop-Common-0.23-Build/1053/testReport (2014-08-26 02:01:30)
>     Failed test: org.apache.hadoop.io.compress.TestCodec.testSnappyCodec
>
> All failed tests <#occurrences: testName>:
>     5: org.apache.hadoop.io.compress.TestCodec.testSnappyCodec
>
> Another example (for job Hadoop-Hdfs-trunk):
>
> [yzhang@localhost jenkinsftf]$ ./determine-flaky-tests-hadoop.py -n 7
> ****Recently FAILED builds in url:
> https://builds.apache.org//job/Hadoop-Hdfs-trunk
>     THERE ARE 7 builds (out of 8) that have failed tests in the past 7
>     days, as listed below:
>
> ==>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1856/testReport (2014-08-30 09:46:54)
>     Failed test: org.apache.hadoop.hdfs.TestDFSClientRetries.testIdempotentAllocateBlockAndClose
>     Failed test: org.apache.hadoop.hdfs.TestDFSClientRetries.testFailuresArePerOperation
>     Failed test: org.apache.hadoop.hdfs.TestDFSClientRetries.testRetryOnChecksumFailure
>     Failed test: org.apache.hadoop.hdfs.TestDFSClientRetries.testWriteTimeoutAtDataNode
>     Failed test: org.apache.hadoop.hdfs.TestDFSClientRetries.testDFSClientRetriesOnBusyBlocks
>     Failed test: org.apache.hadoop.hdfs.TestDFSClientRetries.testClientDNProtocolTimeout
>     Failed test: org.apache.hadoop.hdfs.TestDFSClientRetries.testGetFileChecksum
>     Failed test: org.apache.hadoop.hdfs.TestDFSClientRetries.testNamenodeRestart
> ==>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1855/testReport (2014-08-30 04:31:30)
>     Failed test: org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.testBalancer
>     Failed test: org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.testUnevenDistribution
> ==>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1854/testReport (2014-08-29 04:31:30)
>     Could not open testReport
> ==>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1853/testReport (2014-08-28 09:37:18)
>     Could not open testReport
> ==>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1852/testReport (2014-08-28 09:28:48)
>     Could not open testReport
> ==>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1850/testReport (2014-08-27 04:31:30)
>     Failed test: org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS.testEnd2End
> ==>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1849/testReport (2014-08-26 04:31:29)
>     Failed test: org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer.testBalancer0Integrity
>
> All failed tests <#occurrences: testName>:
>     1: org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer.testBalancer0Integrity
>     1: org.apache.hadoop.hdfs.TestDFSClientRetries.testIdempotentAllocateBlockAndClose
>     1: org.apache.hadoop.hdfs.TestDFSClientRetries.testFailuresArePerOperation
>     1: org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS.testEnd2End
>     1: org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.testUnevenDistribution
>     1: org.apache.hadoop.hdfs.TestDFSClientRetries.testRetryOnChecksumFailure
>     1: org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.testBalancer
>     1: org.apache.hadoop.hdfs.TestDFSClientRetries.testWriteTimeoutAtDataNode
>     1: org.apache.hadoop.hdfs.TestDFSClientRetries.testDFSClientRetriesOnBusyBlocks
>     1: org.apache.hadoop.hdfs.TestDFSClientRetries.testClientDNProtocolTimeout
>     1: org.apache.hadoop.hdfs.TestDFSClientRetries.testGetFileChecksum
>     1: org.apache.hadoop.hdfs.TestDFSClientRetries.testNamenodeRestart
> [yzhang@localhost jenkinsftf]$
>
> On Thu, Aug 28, 2014 at 8:04 PM, Yongjun Zhang <yzh...@cloudera.com> wrote:
>
>> Hi,
>>
>> I just noticed that recent jenkins test reports don't include a link to
>> the test result; the email notice, however, does show the failed tests.
>>
>> E.g.:
>>
>> https://builds.apache.org/job/PreCommit-HDFS-Build/7846//
>>
>> Example of an old job report that has the link:
>>
>> https://builds.apache.org/job/PreCommit-HDFS-Build/7590/
>>
>> Would anyone please take a look?
>>
>> Thanks a lot.
>>
>> --Yongjun
>>
>> On Thu, Aug 28, 2014 at 4:21 PM, Karthik Kambatla <ka...@cloudera.com> wrote:
>>
>>> Thanks Giri and Ted for fixing the builds.
>>>
>>> On Thu, Aug 28, 2014 at 9:49 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>>
>>>> Charles:
>>>> QA build is running for your JIRA:
>>>> https://builds.apache.org/job/PreCommit-hdfs-Build/7828/parameters/
>>>>
>>>> Cheers
>>>>
>>>> On Thu, Aug 28, 2014 at 9:41 AM, Charles Lamb <cl...@cloudera.com> wrote:
>>>>
>>>>> On 8/28/2014 12:07 PM, Giridharan Kesavan wrote:
>>>>>
>>>>>> Fixed all 3 pre-commit builds. test-patch's git reset --hard was
>>>>>> removing the patchprocess dir, so I moved it off the workspace.
>>>>>
>>>>> Thanks Giri. Should I resubmit HDFS-6954's patch?
>>>>> I've gotten 3 or 4 jenkins messages that indicated the problem, so
>>>>> something is resubmitting; but now that you've fixed it, should I
>>>>> resubmit it again?
>>>>>
>>>>> Charles
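[Editor's note: the fix Giri describes (moving the patchprocess scratch directory out of the workspace so test-patch's cleanup cannot delete it) might look roughly like the sketch below. The paths and variable names here are hypothetical, not the actual build-script change.]

```shell
# Hypothetical sketch: place the patch-process scratch dir OUTSIDE the
# git workspace, so cleanup commands run inside the checkout (such as
# git reset --hard / git clean) cannot wipe it between precommit runs.
WORKSPACE="${WORKSPACE:-$(mktemp -d)}"    # Jenkins normally sets $WORKSPACE
PATCH_DIR="${WORKSPACE}/../patchprocess"  # sibling of the checkout, not inside it
mkdir -p "$PATCH_DIR"
echo "patchprocess dir: $PATCH_DIR"
```

The key point is only that `$PATCH_DIR` resolves to a path outside the repository root; any cleanup scoped to the checkout then leaves it untouched.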