Yeah I can do that now.

On Tue, Mar 17, 2015 at 2:53 AM, Vinayakumar B <vinayakum...@apache.org> wrote:
Seems like all builds of Precommit-HDFS-Build are failing with the problem below:

    FATAL: Command "git clean -fdx" returned status code 1:
    stdout:
    stderr: hudson.plugins.git.GitException
    <http://stacktrace.jenkins-ci.org/search?query=hudson.plugins.git.GitException>:
    Command "git clean -fdx" returned status code 1:
    stdout:
    stderr:

Can someone remove "git clean -fdx" from the build configuration of Precommit-HDFS-Build?

Regards,
Vinay

On Tue, Mar 17, 2015 at 12:59 PM, Vinayakumar B <vinayakum...@apache.org> wrote:

I have simulated the problem in my environment and verified that neither 'git clean -xdf' nor 'mvn clean' will remove the directory. mvn fails, whereas git simply ignores the problem (it does not even display a warning).

Regards,
Vinay

On Tue, Mar 17, 2015 at 2:32 AM, Sean Busbey <bus...@cloudera.com> wrote:

Can someone point me to an example build that is broken?

--
Sean

On Mon, Mar 16, 2015 at 3:52 PM, Sean Busbey <bus...@cloudera.com> wrote:

I'm on it. HADOOP-11721

--
Sean

On Mon, Mar 16, 2015 at 3:44 PM, Haohui Mai <whe...@apache.org> wrote:

+1 for git clean.

Colin, can you please get it in ASAP? Currently, due to the Jenkins issues, we cannot close the 2.7 blockers.

Thanks,
Haohui

On Mon, Mar 16, 2015 at 11:54 AM, Colin P. McCabe <cmcc...@apache.org> wrote:

If all it takes is someone creating a test that makes a directory without -x, this is going to happen over and over.

Let's just fix the problem at the root by running "git clean -fqdx" in our Jenkins scripts. If there are no objections, I will add this and un-break the builds.

best,
Colin

On Fri, Mar 13, 2015 at 1:48 PM, Lei Xu <l...@cloudera.com> wrote:

I filed HDFS-7917 to change the way we simulate disk failures.

But I think we still need the infrastructure folks to help with the Jenkins scripts to clean up the directories that were left behind.

--
Lei (Eddy) Xu
Software Engineer, Cloudera

On Fri, Mar 13, 2015 at 1:38 PM, Mai Haohui <ricet...@gmail.com> wrote:

Any updates on this issue? It seems that all HDFS Jenkins builds are still failing.

Regards,
Haohui

On Thu, Mar 12, 2015 at 12:53 AM, Vinayakumar B <vinayakum...@apache.org> wrote:

I think the problem started from here:

https://builds.apache.org/job/PreCommit-HDFS-Build/9828/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDataNodeVolumeFailure/testUnderReplicationAfterVolFailure/

As Chris mentioned, TestDataNodeVolumeFailure changes the permissions. But in this run, ReplicationMonitor hit an NPE and received a terminate signal, which caused MiniDFSCluster.shutdown() to throw an exception.

TestDataNodeVolumeFailure#tearDown() restores those permissions only after shutting down the cluster, so when shutdown() threw, the loop that restores the executable bits never ran and, IMO, the permissions were never restored:

    @After
    public void tearDown() throws Exception {
      if (data_fail != null) {
        FileUtil.setWritable(data_fail, true);
      }
      if (failedDir != null) {
        FileUtil.setWritable(failedDir, true);
      }
      if (cluster != null) {
        cluster.shutdown();
      }
      for (int i = 0; i < 3; i++) {
        FileUtil.setExecutable(new File(dataDir, "data" + (2 * i + 1)), true);
        FileUtil.setExecutable(new File(dataDir, "data" + (2 * i + 2)), true);
      }
    }

Regards,
Vinay
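(Illustration, not part of the original thread: a minimal sketch of how the teardown above could be hardened so that permission restoration runs even when cluster.shutdown() throws. The names — cluster, data_fail, failedDir, dataDir — are taken from the test code quoted above.)

    @After
    public void tearDown() throws Exception {
      try {
        if (cluster != null) {
          cluster.shutdown();
        }
      } finally {
        // The finally block runs even if shutdown() throws, so the data
        // dirs are always left deletable by "mvn clean" or "git clean".
        if (data_fail != null) {
          FileUtil.setWritable(data_fail, true);
        }
        if (failedDir != null) {
          FileUtil.setWritable(failedDir, true);
        }
        for (int i = 0; i < 3; i++) {
          FileUtil.setExecutable(new File(dataDir, "data" + (2 * i + 1)), true);
          FileUtil.setExecutable(new File(dataDir, "data" + (2 * i + 2)), true);
        }
      }
    }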
On Thu, Mar 12, 2015 at 12:35 PM, Vinayakumar B <vinayakum...@apache.org> wrote:

Looking at the history of these kinds of failures, all of them occurred on node H9.

I think some uncommitted patch created the problem and left it there.

Regards,
Vinay

On Thu, Mar 12, 2015 at 6:16 AM, Sean Busbey <bus...@cloudera.com> wrote:

You could rely on a destructive git clean call instead of Maven to do the directory removal.

--
Sean

On Mar 11, 2015 4:11 PM, "Colin McCabe" <cmcc...@alumni.cmu.edu> wrote:

Is there a Maven plugin or setting we can use to simply remove directories that have no executable permissions on them? Clearly we have the permission to do this from a technical point of view (since we created the directories as the jenkins user); it's simply that the code refuses to do it.

Otherwise I guess we can just fix those tests...

Colin

On Tue, Mar 10, 2015 at 2:43 PM, Lei Xu <l...@cloudera.com> wrote:

Thanks a lot for looking into HDFS-7722, Chris.

In HDFS-7722, the TestDataNodeVolumeFailureXXX tests reset data dir permissions in tearDown(), and TestDataNodeHotSwapVolumes resets permissions in a finally clause.

Also, I ran mvn test several times on my machine and all tests passed.

However, DiskChecker#checkDirAccess() begins with an isDirectory() check:

    private static void checkDirAccess(File dir) throws DiskErrorException {
      if (!dir.isDirectory()) {
        throw new DiskErrorException("Not a directory: " + dir.toString());
      }

      checkAccessByFileMethods(dir);
    }

So one potentially safer alternative is to replace the data dir with a regular file to simulate disk failures, rather than removing permissions.

--
Lei (Eddy) Xu
Software Engineer, Cloudera
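(Illustration, not from the thread: a minimal sketch of the regular-file approach Eddy describes, for a hypothetical JUnit test. Because DiskChecker#checkDirAccess() throws DiskErrorException as soon as isDirectory() returns false, no permission bits ever change, and there is nothing to restore if the test process dies midway. The helper names are invented for this example; FileUtil is Hadoop's org.apache.hadoop.fs.FileUtil.)

    import java.io.File;
    import java.io.IOException;
    import org.apache.hadoop.fs.FileUtil;

    // Simulate a failed volume by replacing the data directory with an
    // ordinary file of the same name. DiskChecker#checkDirAccess() then
    // fails its isDirectory() check and reports a disk error.
    private void simulateVolumeFailure(File volumeDir) throws IOException {
      FileUtil.fullyDelete(volumeDir);     // remove the real directory
      if (!volumeDir.createNewFile()) {    // drop a plain file in its place
        throw new IOException("Could not create placeholder " + volumeDir);
      }
    }

    // Undoing the simulated failure is just as simple, and needs no chmod:
    private void restoreVolume(File volumeDir) throws IOException {
      if (!volumeDir.delete() || !volumeDir.mkdirs()) {
        throw new IOException("Could not restore " + volumeDir);
      }
    }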
On Tue, Mar 10, 2015 at 2:19 PM, Chris Nauroth <cnaur...@hortonworks.com> wrote:

TestDataNodeHotSwapVolumes, TestDataNodeVolumeFailure, TestDataNodeVolumeFailureReporting, and TestDataNodeVolumeFailureToleration all remove executable permissions from directories like the one Colin mentioned to simulate disk failures at data nodes. I reviewed the code for all of those, and they all appear to be doing the necessary work to restore executable permissions at the end of the test. The only recent uncommitted patch I've seen that makes changes in these test suites is HDFS-7722, and that patch still looks fine. I don't know if there are other uncommitted patches that changed these test suites.

I suppose it's also possible that the JUnit process unexpectedly died after removing executable permissions but before restoring them. That would always have been a weakness of these test suites, regardless of any recent changes.

Chris Nauroth
Hortonworks
http://hortonworks.com/

On 3/10/15, 1:47 PM, "Aaron T. Myers" <a...@cloudera.com> wrote:

Hey Colin,

I asked Andrew Bayer, who works with Apache Infra, what's going on with these boxes. He took a look and concluded that our unit tests are setting permissions on those directories that preclude the files from being deleted. He's going to clean up the boxes for us, but we should expect this to keep happening until we can fix the test in question to properly clean up after itself.

To help narrow down which commit started this, Andrew sent me this info:

"/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data3/ has 500 perms, so I'm guessing that's the problem. Been that way since 9:32 UTC on March 5th."

--
Aaron T. Myers
Software Engineer, Cloudera
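(For context, an illustration rather than part of the thread: on POSIX systems, unlinking a file requires write and search permission on its parent directory, so a data dir left with restricted permissions such as 500 makes everything under it undeletable. That is exactly why maven-clean-plugin fails on data3. A self-contained demo, assuming a non-root user on a POSIX filesystem:)

    import java.io.File;
    import java.io.IOException;
    import java.nio.file.Files;

    public class PermDemo {
      public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("data3-demo").toFile();
        File child = new File(dir, "blk_0001");
        if (!child.createNewFile()) throw new IOException("setup failed");

        dir.setWritable(false);             // roughly what mode 500 implies

        System.out.println(child.delete()); // false: parent not writable
        System.out.println(dir.delete());   // false: directory not empty

        dir.setWritable(true);              // restore, as the tests intend to
        System.out.println(child.delete()); // true
        System.out.println(dir.delete());   // true
      }
    }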
On Tue, Mar 10, 2015 at 1:24 PM, Colin P. McCabe <cmcc...@apache.org> wrote:

Hi all,

A very quick (and not thorough) survey shows that I can't find any Jenkins jobs that succeeded in the last 24 hours. Most of them seem to be failing with some variant of this message:

    [ERROR] Failed to execute goal
    org.apache.maven.plugins:maven-clean-plugin:2.5:clean (default-clean)
    on project hadoop-hdfs: Failed to clean project: Failed to delete
    /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data3
    -> [Help 1]

Any ideas how this happened? Bad disk, unit test setting wrong permissions?

Colin
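(A closing illustration, not from the thread: Colin asked above whether Maven can remove directories that lack executable permissions, and Vinay later verified that neither mvn clean nor git clean will. Any tool that does it has to restore permissions on the way down and delete on the way up. A minimal sketch; ForceClean and forceDelete are invented names, not an actual Maven or Hadoop facility:)

    import java.io.File;
    import java.io.IOException;

    public final class ForceClean {
      public static void forceDelete(File f) throws IOException {
        // Re-grant rwx first: a directory needs read+execute to be listed,
        // and write so that its entries can be unlinked.
        f.setReadable(true);
        f.setWritable(true);
        f.setExecutable(true);

        File[] children = f.listFiles();    // null if f is a regular file
        if (children != null) {
          for (File child : children) {
            forceDelete(child);             // depth-first: empty dirs first
          }
        }
        if (!f.delete()) {
          throw new IOException("Could not delete " + f);
        }
      }

      public static void main(String[] args) throws IOException {
        forceDelete(new File(args[0]));     // e.g. target/test/data/dfs
      }
    }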