Hi Folks!

After working on test-patch with other folks for the last few months, I think we've reached the point where we can make the fastest progress towards a general-use pre-commit patch tester by spinning things off into a project focused on just that. I think we have a mature enough code base and a sufficient fledgling community, so I'm going to put together a TLP (top-level project) proposal.

Thanks for the feedback thus far from use within Hadoop. I hope we can continue to make things more useful.

-Sean

On Wed, Mar 11, 2015 at 5:16 PM, Sean Busbey <bus...@cloudera.com> wrote:

> HBase's dev-support folder is where the scripts and support files live.
> We've only recently started adding anything to the maven builds that's
> specific to jenkins[1]; so far it's diagnostic stuff, but that's where I'd
> add in more if we ran into the same permissions problems y'all are having.
>
> There's also our precommit job itself, though it isn't large[2]. AFAIK, we
> don't properly back this up anywhere, we just notify each other of changes
> on a particular mail thread[3].
>
> [1]: https://github.com/apache/hbase/blob/master/pom.xml#L1687
> [2]: https://builds.apache.org/job/PreCommit-HBASE-Build/ (they're all
> red because I just finished fixing "mvn site" running out of permgen)
> [3]: http://s.apache.org/NT0
>
>
> On Wed, Mar 11, 2015 at 4:51 PM, Chris Nauroth <cnaur...@hortonworks.com>
> wrote:
>
>> Sure, thanks Sean! Do we just look in the dev-support folder in the HBase
>> repo? Is there any additional context we need to be aware of?
>>
>> Chris Nauroth
>> Hortonworks
>> http://hortonworks.com/
>>
>>
>> On 3/11/15, 2:44 PM, "Sean Busbey" <bus...@cloudera.com> wrote:
>>
>> >+dev@hbase
>> >
>> >HBase has recently been cleaning up our precommit jenkins jobs to make them
>> >more robust. From what I can tell our stuff started off as an earlier
>> >version of what Hadoop uses for testing.
>> >
>> >Folks on either side open to an experiment of combining our precommit check
>> >tooling? In principle we should be looking for the same kinds of things.
>> >
>> >Naturally we'll still need different jenkins jobs to handle different
>> >resource needs and we'd need to figure out where stuff eventually lives,
>> >but that could come later.
>> >
>> >On Wed, Mar 11, 2015 at 4:34 PM, Chris Nauroth <cnaur...@hortonworks.com>
>> >wrote:
>> >
>> >> The only thing I'm aware of is the failOnError option:
>> >>
>> >> http://maven.apache.org/plugins/maven-clean-plugin/examples/ignoring-errors.html
>> >>
>> >> I prefer that we don't disable this, because ignoring different kinds of
>> >> failures could leave our build directories in an indeterminate state.
>> >> For example, we could end up with an old class file on the classpath for
>> >> test runs that was supposedly deleted.
>> >>
>> >> I think it's worth exploring Eddy's suggestion to try simulating failure
>> >> by placing a file where the code expects to see a directory. That might
>> >> even let us enable some of these tests that are skipped on Windows,
>> >> because Windows allows access for the owner even after permissions have
>> >> been stripped.
>> >>
>> >> Chris Nauroth
>> >> Hortonworks
>> >> http://hortonworks.com/
>> >>
>> >>
>> >> On 3/11/15, 2:10 PM, "Colin McCabe" <cmcc...@alumni.cmu.edu> wrote:
>> >>
>> >> >Is there a maven plugin or setting we can use to simply remove
>> >> >directories that have no executable permissions on them? Clearly we
>> >> >have the permission to do this from a technical point of view (since
>> >> >we created the directories as the jenkins user), it's simply that the
>> >> >code refuses to do it.
>> >> >
>> >> >Otherwise I guess we can just fix those tests...
>> >> >
>> >> >Colin
>> >> >
>> >> >On Tue, Mar 10, 2015 at 2:43 PM, Lei Xu <l...@cloudera.com> wrote:
>> >> >> Thanks a lot for looking into HDFS-7722, Chris.
>> >> >>
>> >> >> In HDFS-7722:
>> >> >> TestDataNodeVolumeFailureXXX tests reset data dir permissions in TearDown().
>> >> >> TestDataNodeHotSwapVolumes reset permissions in a finally clause.
>> >> >>
>> >> >> Also I ran mvn test several times on my machine and all tests passed.
>> >> >>
>> >> >> However, since in DiskChecker#checkDirAccess():
>> >> >>
>> >> >> private static void checkDirAccess(File dir) throws DiskErrorException {
>> >> >>   if (!dir.isDirectory()) {
>> >> >>     throw new DiskErrorException("Not a directory: "
>> >> >>         + dir.toString());
>> >> >>   }
>> >> >>
>> >> >>   checkAccessByFileMethods(dir);
>> >> >> }
>> >> >>
>> >> >> One potentially safer alternative is replacing data dir with a regular
>> >> >> file to simulate disk failures.
>> >> >>
>> >> >> On Tue, Mar 10, 2015 at 2:19 PM, Chris Nauroth
>> >> >> <cnaur...@hortonworks.com> wrote:
>> >> >>> TestDataNodeHotSwapVolumes, TestDataNodeVolumeFailure,
>> >> >>> TestDataNodeVolumeFailureReporting, and
>> >> >>> TestDataNodeVolumeFailureToleration all remove executable permissions from
>> >> >>> directories like the one Colin mentioned to simulate disk failures at data
>> >> >>> nodes. I reviewed the code for all of those, and they all appear to be
>> >> >>> doing the necessary work to restore executable permissions at the end of
>> >> >>> the test. The only recent uncommitted patch I've seen that makes changes
>> >> >>> in these test suites is HDFS-7722. That patch still looks fine though. I
>> >> >>> don't know if there are other uncommitted patches that changed these test
>> >> >>> suites.
>> >> >>>
>> >> >>> I suppose it's also possible that the JUnit process unexpectedly died
>> >> >>> after removing executable permissions but before restoring them. That
>> >> >>> always would have been a weakness of these test suites, regardless of any
>> >> >>> recent changes.
>> >> >>>
>> >> >>> Chris Nauroth
>> >> >>> Hortonworks
>> >> >>> http://hortonworks.com/
>> >> >>>
>> >> >>>
>> >> >>> On 3/10/15, 1:47 PM, "Aaron T. Myers" <a...@cloudera.com> wrote:
>> >> >>>
>> >> >>>>Hey Colin,
>> >> >>>>
>> >> >>>>I asked Andrew Bayer, who works with Apache Infra, what's going on with
>> >> >>>>these boxes. He took a look and concluded that some perms are being set in
>> >> >>>>those directories by our unit tests which are precluding those files from
>> >> >>>>getting deleted. He's going to clean up the boxes for us, but we should
>> >> >>>>expect this to keep happening until we can fix the test in question to
>> >> >>>>properly clean up after itself.
>> >> >>>>
>> >> >>>>To help narrow down which commit it was that started this, Andrew sent me
>> >> >>>>this info:
>> >> >>>>
>> >> >>>>"/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data3/
>> >> >>>>has 500 perms, so I'm guessing that's the problem. Been that way since
>> >> >>>>9:32 UTC on March 5th."
>> >> >>>>
>> >> >>>>--
>> >> >>>>Aaron T. Myers
>> >> >>>>Software Engineer, Cloudera
>> >> >>>>
>> >> >>>>On Tue, Mar 10, 2015 at 1:24 PM, Colin P. McCabe <cmcc...@apache.org>
>> >> >>>>wrote:
>> >> >>>>
>> >> >>>>> Hi all,
>> >> >>>>>
>> >> >>>>> A very quick (and not thorough) survey shows that I can't find any
>> >> >>>>> jenkins jobs that succeeded from the last 24 hours. Most of them seem
>> >> >>>>> to be failing with some variant of this message:
>> >> >>>>>
>> >> >>>>> [ERROR] Failed to execute goal
>> >> >>>>> org.apache.maven.plugins:maven-clean-plugin:2.5:clean (default-clean)
>> >> >>>>> on project hadoop-hdfs: Failed to clean project: Failed to delete
>> >> >>>>> /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data3
>> >> >>>>> -> [Help 1]
>> >> >>>>>
>> >> >>>>> Any ideas how this happened?
>> >> >>>>> Bad disk, unit test setting wrong permissions?
>> >> >>>>>
>> >> >>>>> Colin
>> >> >>>>>
>> >> >>>
>> >> >>
>> >> >> --
>> >> >> Lei (Eddy) Xu
>> >> >> Software Engineer, Cloudera
>> >> >
>> >
>> >--
>> >Sean
>>
>
> --
> Sean

--
Sean
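
For anyone skimming the thread later: Eddy's suggestion of putting a regular file where the code expects a directory could look roughly like the sketch below. It is only an illustration, not code from HDFS-7722 or from Hadoop. The class name SimulatedDiskFailure and the helpers failVolume, restoreVolume, and looksLikeHealthyDir are hypothetical, and looksLikeHealthyDir only mirrors the isDirectory() test quoted from DiskChecker#checkDirAccess. The sketch also assumes the data directory is empty when the failure is injected.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration only; none of these names exist in Hadoop.
class SimulatedDiskFailure {

    // Stand-in for the isDirectory() check quoted from
    // DiskChecker#checkDirAccess; the real method throws DiskErrorException.
    public static boolean looksLikeHealthyDir(File dir) {
        return dir.isDirectory();
    }

    // Simulates a failed volume by swapping the (empty) data directory for a
    // regular file of the same name. No permission bits are changed.
    // Files.delete throws DirectoryNotEmptyException if the directory has content.
    public static void failVolume(Path dataDir) throws IOException {
        Files.delete(dataDir);
        Files.createFile(dataDir);
    }

    // Undoes failVolume: removes whatever is at the path and recreates the directory.
    public static void restoreVolume(Path dataDir) throws IOException {
        Files.deleteIfExists(dataDir);
        Files.createDirectory(dataDir);
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("simulated-failure");
        Path dataDir = Files.createDirectory(root.resolve("data3"));
        try {
            failVolume(dataDir);
            System.out.println("healthy after injection: "
                + looksLikeHealthyDir(dataDir.toFile()));
        } finally {
            // Restore in a finally clause, as the existing tests do for permissions.
            restoreVolume(dataDir);
            Files.delete(dataDir);
            Files.delete(root);
        }
    }
}
```

The point of doing it this way is that nothing is chmod'ed, so if the JUnit process dies between injecting the failure and restoring the directory, the leftover is an ordinary file that "mvn clean" should still be able to delete.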