Re: .23 compile times
Hi Sriram,

You may want to try adding your source tree to Spotlight's privacy list, otherwise it's busy trying to index files as fast as the build is writing them. If you are (un)fortunate enough to be running anti-virus software, then it too is scurrying along behind your build checking every file. That's enough to turn a normally silent MBP into a hair dryer.

If you don't want to omit the entire project, because perhaps you want to use mdfind, then you can rename dirs to end with ".noindex". The trick/hack is you can symlink the original dir name to the renamed dir.noindex.

But yes, I think the build is way too slow to be productive when used as intended. The fact that people have to use custom tricks to build & test in a "reasonable" amount of time is a symptom of a problem...

Daryn

On Feb 15, 2012, at 12:47 PM, Todd Lipcon wrote:

> Hi Sriram,
>
> I also do -Dmaven.javadoc.skip=true usually and that cuts out a couple minutes. Unless you need a tarball, you can also drop -Ptar.
>
> When I'm just making quick changes confined to the java code, I usually just use "mvn -DskipTests install" from within the hdfs project, then manually cp the resulting target/hadoop-hdfs-0.24.0-SNAPSHOT.jar into my install dir/share/hadoop/hdfs. Much faster than the full rebuild.
>
> -Todd
>
> On Wed, Feb 15, 2012 at 8:52 AM, Harsh J wrote:
>> Sriram,
>>
>> My package command didn't take as long with the same command -- where do you see it getting stuck at most of the times? Which particular target (right term?) seems to take the highest time to complete?
>>
>> On Wed, Feb 15, 2012 at 12:31 AM, Sriram Rao wrote:
>>> Folks,
>>>
>>> The build takes forever:
>>>
>>> mvn package -Pdist -DskipTests -Dtar
>>>
>>> [INFO] BUILD SUCCESS
>>> [INFO] Total time: 6:25.399s
>>> [INFO] Finished at: Tue Feb 14 10:54:18 PST 2012
>>> [INFO] Final Memory: 56M/123M
>>>
>>> This is on a mac with 8GB RAM. (The above case involved 0 lines of code change; just making sure I had everything built!).
>>>
>>> Is there a faster way to get the thing to build?
>>>
>>> Sriram
>>
>> --
>> Harsh J
>> Customer Ops. Engineer
>> Cloudera | http://tiny.cloudera.com/about
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
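For reference, the rename-plus-symlink trick Daryn describes looks like this from a shell (a sketch; the ~/hadoop location is hypothetical):

    mv ~/hadoop ~/hadoop.noindex      # Spotlight skips directories whose names end in .noindex
    ln -s ~/hadoop.noindex ~/hadoop   # the original path keeps working through the symlink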
Re: .23 compile times
Wow, nice! How long is a rebuild with no changes?

I'd be delighted if there was a dev profile that just did the bare minimum to produce a working env.

Daryn

On Feb 16, 2012, at 3:25 PM, Alejandro Abdelnur wrote:

> I've just done some tweaks in the POMs and I'm cutting the 8.30min full dist build down to 3.45min. I'm removing all javadoc stuff from the dist profile.
>
> I think we should introduce a developer profile that does this.
>
> Also, the 'mvn install -DskipTests', to get the JARs in your m2 cache, is taking 2mins. But we are building source JARs; the developer profile could skip the generation of the source JARs, which would cut several seconds.
>
> Finally, note that just building all JARs with 'mvn clean test -DskipTests' is taking 1.55mins.
>
> If folks agree I'll open a JIRA to optimize the build based on these findings.
>
> Thxs.
>
> Alejandro
>
> On Wed, Feb 15, 2012 at 1:02 PM, Daryn Sharp wrote:
>
>> [...]
Re: .23 compile times
Very nice improvement! How long is "mvn install -DskipTests -Dmaven.javadoc.skip"?

Assuming that's fast, is there possibly a way to allow hadoop to run out of the checked-out tree? Maybe through the CLASSPATH, or by subprojects copying the built jars into the dist area? The ideal would be to run a full maven build, followed by just a mvn build within a specific subproject.

Daryn

On Feb 16, 2012, at 6:15 PM, Alejandro Abdelnur wrote:

> A 'mvn package -DskipTests -Pdist' (developer) with no changes takes 3.35mins
>
> The biggest offenders are:
>
> [INFO] Apache Hadoop Common .. SUCCESS [43.299s]
> [INFO] Apache Hadoop HDFS  SUCCESS [54.776s]
> [INFO] hadoop-yarn-api ... SUCCESS [15.466s]
> [INFO] hadoop-mapreduce .. SUCCESS [16.154s]
> [INFO] Apache Hadoop Distributed Copy  SUCCESS [9.657s]
>
> Everybody else is below 5secs
>
> A 'mvn test -DskipTests' with no changes takes 1.05mins (this should go down by making some of the codegen smarter instead of just recompiling every time)
>
> The biggest offenders are:
>
> [INFO] Apache Hadoop Common .. SUCCESS [9.121s]
> [INFO] Apache Hadoop HDFS  SUCCESS [20.887s]
> [INFO] hadoop-yarn-api ... SUCCESS [6.664s]
>
> A few 2sec modules and the rest below 1sec
>
> Thanks.
>
> Alejandro
>
> On Thu, Feb 16, 2012 at 2:13 PM, Daryn Sharp wrote:
>
>> Wow, nice! How long is a rebuild with no changes?
>>
>> I'd be delighted if there was a dev profile that just did the bare minimum to produce a working env.
>>
>> Daryn
>>
>> On Feb 16, 2012, at 3:25 PM, Alejandro Abdelnur wrote:
>>
>>> [...]
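Collecting the flags from this thread, the "bare minimum" developer build would look roughly like this (a sketch: -Dmaven.javadoc.skip=true is the javadoc plugin's standard skip property; -Dmaven.source.skip=true is assumed here as the switch for the source JARs Alejandro mentions skipping):

    mvn install -DskipTests -Dmaven.javadoc.skip=true -Dmaven.source.skip=true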
Re: Apache Hadoop works with IPv6?
Hi Marcos,

Hadoop won't work in a pure IPv6 environment. Hadoop might work in an IPv4/IPv6 environment since the default is to prefer IPv4 addresses. It also might work if they are publishing IPv4 addrs over IPv6.

The main problem is that hadoop heavily relies on strings containing "ip:port". The string is naively parsed to split on the first colon, which obviously isn't going to work on IPv6 addresses since they contain colons.

A while back I updated some common apis with IPv6 support in mind -- the support isn't complete though. Bad news: there are quite a number of places throughout hadoop not using the apis. Good news: For other reasons, I'm currently fixing many of those places to use the apis. After that work is complete, enabling IPv6 support might be within relatively easy to moderate reach.

Daryn

On Mar 21, 2012, at 4:45 PM, Marcos Ortiz wrote:

> Regards.
> I'm very interested to know if Apache Hadoop works with IPv6 hosts. One of my clients has some hosts with this feature and they want to know if Hadoop supports this. Has anyone tested this?
>
> Best wishes
>
> --
> Marcos Luis Ortíz Valmaseda (@marcosluis2186)
> Data Engineer at UCI
> http://marcosluis2186.posterous.com
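A minimal sketch of the parsing pitfall Daryn describes (illustrative Java, not Hadoop's actual code): splitting "host:port" on the first colon breaks for IPv6, while splitting on the last colon and stripping the RFC 3986 brackets survives it.

    public class HostPortSplit {
      // Naive parse: assumes the first colon separates host and port.
      static String naiveHost(String hostPort) {
        return hostPort.substring(0, hostPort.indexOf(':'));
      }

      // Safer parse: the port always follows the last colon, and an IPv6
      // literal arrives bracketed, e.g. "[2001:db8::1]:8020".
      static String saferHost(String hostPort) {
        String host = hostPort.substring(0, hostPort.lastIndexOf(':'));
        if (host.startsWith("[") && host.endsWith("]")) {
          host = host.substring(1, host.length() - 1);
        }
        return host;
      }

      public static void main(String[] args) {
        System.out.println(naiveHost("10.0.0.1:8020"));       // 10.0.0.1 -- fine
        System.out.println(naiveHost("[2001:db8::1]:8020"));  // "[2001" -- broken
        System.out.println(saferHost("[2001:db8::1]:8020"));  // 2001:db8::1
      }
    }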
Re: Apache Hadoop works with IPv6?
No, there aren't jiras that I've filed, or that I'm aware of, for the remaining IPv6 work.

Daryn

On Mar 23, 2012, at 3:01 PM, Marcos Ortiz wrote:

> Thanks a lot, Daryn. Do you have any JIRA issue for this work? It would be nice to work on this.
>
> Regards
>
> On 03/22/2012 03:02 PM, Daryn Sharp wrote:
>> [...]
>
> --
> Marcos Luis Ortíz Valmaseda (@marcosluis2186)
> Data Engineer at UCI
> http://marcosluis2186.posterous.com
Re: Help with error
The copyFromLocalFile method has a void return, but internally is calling FileUtil methods that may return a boolean for success. False appears to be returned if the source file cannot be deleted, or if the dest directory cannot be created. The boolean result is ignored by copyFromLocalFile, leading the caller to believe the copy was successful. I'm not sure if this bug is aggravating your situation, so I'd try to manually create the dest dir and remove the src file.

Daryn

On Apr 9, 2012, at 4:51 PM, Ralph Castain wrote:

> On Apr 9, 2012, at 3:50 PM, Ralph Castain wrote:
>
>> On Apr 9, 2012, at 2:45 PM, Kihwal Lee wrote:
>>
>>> The path, "file:/Users/rhc/yarnrun/13", indicates that your copy operation's destination was the local file system, instead of hdfs.
>>
>> Yeah, I realized that too after I sent the note. Sure enough - the files are there.
>
> Quick correction: the path exists, but as a file instead of a directory, and therefore the files to be moved there don't exist.
>
>>> What is the value of "fs.default.name" set to in core-site.xml?
>>
>> <property>
>>   <name>fs.default.name</name>
>>   <value>hdfs://localhost:9000</value>
>> </property>
>>
>>> Kihwal
>>>
>>> On 4/9/12 3:26 PM, "Ralph Castain" wrote:
>>>
>>> Finally managed to chase down the 0.23 API docs and get the FileStatus definition. No real joy here - I output the path and got:
>>>
>>> code: LOG.info("destination path " + destStatus.getPath());
>>>
>>> 2012-04-09 14:22:48,359 INFO Hamster (Hamster.java:getApplication(265)) - destination path file:/Users/rhc/yarnrun/13
>>>
>>> However, when I attempt to list it:
>>>
>>> Ralphs-iMac:bin rhc$ ./hdfs dfs -ls /Users/rhc/yarnrun
>>> 2012-04-09 14:22:57.640 java[14292:1903] Unable to load realm info from SCDynamicStore
>>> 2012-04-09 14:22:57.686 java[14292:1903] Unable to load realm info from SCDynamicStore
>>> ls: `/Users/rhc/yarnrun': No such file or directory
>>>
>>> I've been unable to track down the "realm" warnings, so I don't know if that is pertinent or not. It appears the files are not getting copied across, though the location looks okay to my eyes.
>>>
>>> On Apr 9, 2012, at 1:27 PM, Kihwal Lee wrote:
>>>
>>>> It looks like the home directory does not exist but the copy went through. Can you try to LOG the key fields in destStatus including path? It might be ending up in an unexpected place.
>>>>
>>>> Kihwal
>>>>
>>>> On 4/9/12 12:45 PM, "Ralph Castain" wrote:
>>>>
>>>> Hi Bobby
>>>>
>>>> On Apr 9, 2012, at 11:40 AM, Robert Evans wrote:
>>>>
>>>>> What do you mean by relocated some supporting files to HDFS? How do you relocate them? What API do you use?
I use the LocalResource and FileSystem classes to do the relocation, per the Hadoop example:

    // set local resources for the application master
    // local files or archives as needed
    // In this scenario, the jar file for the application master is part of the local resources
    Map<String, LocalResource> localResources = new HashMap<String, LocalResource>();

    LOG.info("Copy openmpi tarball from local filesystem and add to local environment");
    // Copy the application master jar to the filesystem
    // Create a local resource to point to the destination jar path
    FileSystem fs;
    FileStatus destStatus;
    try {
      fs = FileSystem.get(conf);
      Path src = new Path(pathOMPItarball);
      String pathSuffix = appName + "/" + appId.getId();
      Path dst = new Path(fs.getHomeDirectory(), pathSuffix);
      try {
        fs.copyFromLocalFile(false, true, src, dst);
        try {
          destStatus = fs.getFileStatus(dst);
          LocalResource amJarRsrc = Records.newRecord(LocalResource.class);
          // Set the type of resource - file or archive
          // archives are untarred at destination
          amJarRsrc.setType(LocalResourceType.ARCHIVE);
          // Set visibility of the resource
          // Setting to most private option
          amJarRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
          // Set the resource to be copied over
          amJarRsrc.setResource(ConverterUtils.getYarnUrlFromPath(dst));
          // Set timestamp and length of file so that the framework
          // can do basic sanity checks for the local resource
          // after it has been copied over to ensure it is the same
          // resource the client intended to use with the application
          amJarRsrc.setTimestamp(destStatus.getModificationTime());
          amJarRsrc.setSize(destStatus.getLen());
          localResources.put("openmpi", amJarRsrc);
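A sketch of the manual workaround Daryn suggests above -- pre-creating the dest dir and verifying the copy actually landed, since copyFromLocalFile's void return hides the FileUtil failure (variable names follow Ralph's snippet; this is illustrative, not a FileUtil fix):

    // Create the destination parent explicitly instead of trusting the copy.
    Path dstDir = dst.getParent();
    if (!fs.exists(dstDir) && !fs.mkdirs(dstDir)) {
      throw new IOException("could not create " + dstDir);
    }
    fs.copyFromLocalFile(false, true, src, dst);
    // Verify the copy: getFileStatus throws FileNotFoundException if nothing landed.
    FileStatus copied = fs.getFileStatus(dst);
    if (copied.getLen() == 0) {
      LOG.warn(dst + " was created but is empty; the copy may have failed silently");
    }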
Re: [VOTE] Plan to create release candidate for 0.23.7
+1 (non-binding)

On Mar 18, 2013, at 1:05 AM, Tsuyoshi OZAWA wrote:

> +1 (non-binding)
>
> On Sun, Mar 17, 2013 at 9:01 AM, Harsh J wrote:
>> +1
>>
>> On Sat, Mar 16, 2013 at 12:19 AM, Karthik Kambatla wrote:
>>
>>> +1 (non-binding)
>>>
>>> On Fri, Mar 15, 2013 at 9:12 AM, Robert Evans wrote:
>>>
>>>> +1
>>>>
>>>> On 3/13/13 11:31 AM, "Thomas Graves" wrote:
>>>>
>>>>> Hello all,
>>>>>
>>>>> I think enough critical bug fixes have gone in to branch-0.23 that warrant another release. I plan on creating a 0.23.7 release by the end of March.
>>>>>
>>>>> Please vote '+1' to approve this plan. Voting will close on Wednesday 3/20 at 10:00am PDT.
>>>>>
>>>>> Thanks,
>>>>> Tom Graves
>>>>> (release manager)
>>
>> --
>> Harsh J
>
> --
> - Tsuyoshi
Re: TestHDFSCLI is broken
Hi Eli, I noticed the issue yesterday. It's from a recent change of mine in common, and I'm not sure how I didn't catch the problem... I must have missed doing a veryclean in hdfs before running the tests. I'll have a patch up this morning. Daryn On Jun 9, 2011, at 5:33 PM, Eli Collins wrote: > Hey guys, > > TestHDFSCLI is failing on trunk. It's been failing for several days > (so it's not HDFS-494). > > https://builds.apache.org/job/Hadoop-Hdfs-trunk/lastCompletedBuild/testReport > > Is anyone looking at this or aware of what change would have caused > this? Looks like it started on June 7th. > > The output of the test is noisy enough that it's hard to quickly see > what caused the issue. > > Thanks, > Eli
Re: [DISCUSS] Hadoop SSO/Token Server Components
Sorry for falling out of the loop. I'm catching up on the jiras and discussion, and will comment this afternoon.

Daryn

On Jul 10, 2013, at 8:42 AM, Larry McCay wrote:

> All -
>
> After combing through this thread - as well as the summit session summary thread - I think that we have the following two items that we can probably move forward with:
>
> 1. TokenAuth method - assuming this means the pluggable authentication mechanisms within the RPC layer (2 votes: Kai and Kyle)
> 2. An actual Hadoop Token format (2 votes: Brian and myself)
>
> I propose that we attack both of these aspects as one. Let's provide the structure and interfaces of the pluggable framework for use in the RPC layer through leveraging Daryn's pluggability work and POC it with a particular token format (not necessarily the only format ever supported - we just need one to start). If there has already been work done in this area by anyone then please speak up and commit to providing a patch - so that we don't duplicate effort.
>
> @Daryn - is there a particular Jira or set of Jiras that we can look at to discern the pluggability mechanism details? Documentation of it would be great as well.
> @Kai - do you have existing code for the pluggable token authentication mechanism - if not, we can take a stab at representing it with interfaces and/or POC code.
> I can stand up and say that we have a token format that we have been working with already and can provide a patch that represents it as a contribution to test out the pluggable tokenAuth.
>
> These patches will provide progress toward code being the central discussion vehicle. As a community, we can then incrementally build on that foundation in order to collaboratively deliver the common vision.
>
> In the absence of any other home for posting such patches, let's assume that they will be attached to HADOOP-9392 - or a dedicated subtask for this particular aspect/s - I will leave that detail to Kai.
>
> @Alejandro, being the only voice on this thread that isn't represented in the votes above, please feel free to agree or disagree with this direction.
>
> thanks,
>
> --larry
>
> On Jul 5, 2013, at 3:24 PM, Larry McCay wrote:
>
>> Hi Andy -
>>
>>> Happy Fourth of July to you and yours.
>>
>> Same to you and yours. :-)
>> We had some fun in the sun for a change - we've had nothing but rain on the east coast lately.
>>
>>> My concern here is there may have been a misinterpretation or lack of consensus on what is meant by "clean slate"
>>
>> Apparently so.
>> On the pre-summit call, I stated that I was interested in reconciling the jiras so that we had one to work from.
>>
>> You recommended that we set them aside for the time being - with the understanding that work would continue on your side (and ours as well) - and approach the community discussion from a clean slate.
>> We seemed to do this at the summit session quite well.
>> It was my understanding that this community discussion would live beyond the summit and continue on this list.
>>
>> While closing the summit session we agreed to follow up on common-dev with first a summary then a discussion of the moving parts.
>>
>> I never expected the previous work to be abandoned and fully expected it to inform the discussion that happened here.
>>
>> If you would like to reframe what clean slate was supposed to mean or describe what it means now - that would be welcome - before I waste any more time trying to facilitate a community discussion that is apparently not wanted.
>>
>>> Nowhere in this picture are self appointed "master JIRAs" and such, which have been disappointing to see crop up, we should be collaboratively coding not planting flags.
>>
>> I don't know what you mean by self-appointed master JIRAs.
>> It has certainly not been anyone's intention to disappoint.
>> Any mention of a new JIRA was just to have a clear context to gather the agreed upon points - previous and/or existing JIRAs would easily be linked.
>>
>> Planting flags… I need to go back and read my discussion point about the JIRA and see how this is the impression that was made.
>> That is not how I define success. The only flags that count are code. What we are lacking is the roadmap on which to put the code.
>>
>>> I read Kai's latest document as something approaching today's consensus (or at least a common point of view?) rather than a historical document. Perhaps he and it can be given equal share of the consideration.
>>
>> I definitely read it as something that has evolved into something approaching what we have been talking about so far. There has not however been enough discussion anywhere near the level of detail in that document and more details are needed for each component in the design.
>> Why the work in that document should not be fed into the community discussion as anyone el
Re: [VOTE] Release Apache Hadoop 2.1.0-beta
I broke RPC QOP for integrity and privacy options. :( See blocker HADOOP-9816. I think I understand the problem and it shouldn't be hard to fix.

The bug went unnoticed because sadly there are no unit tests for the QOP options, even though it just involves a conf setting.

Daryn

On Jul 29, 2013, at 5:00 PM, Arun C Murthy wrote:

> Ok, I think we are close to rc1 now - the last of the blockers should be committed later today… I'll try and spin RC1 tonight.
>
> thanks,
> Arun
>
> On Jul 21, 2013, at 12:43 AM, Devaraj Das wrote:
>
>> I have just raised https://issues.apache.org/jira/browse/HDFS-5016 .. This bug can easily be reproduced by some HBase tests. I'd like this to be considered before we make a beta release. Have spoken about this with some hdfs folks offline and I am told that it is being worked on.
>>
>> Thanks
>> Devaraj
>>
>> On Wed, Jul 17, 2013 at 4:25 PM, Alejandro Abdelnur wrote:
>>
>>> As I've mentioned in my previous email, if we get YARN-701 in, we should also get in the fix for unmanaged AMs in an un-secure setup in 2.1.0-beta. Else it is a regression of functionality that is already working.
>>>
>>> Because of that, to avoid further delaying the release, I'm suggesting to mention in the release notes the API changes and behavior changes that YARN-918 and YARN-701 will bring into the next beta or GA release.
>>>
>>> thx
>>>
>>> On Wed, Jul 17, 2013 at 4:14 PM, Vinod Kumar Vavilapalli < vino...@hortonworks.com> wrote:
>>>
>>>> On Jul 17, 2013, at 1:04 PM, Alejandro Abdelnur wrote:
>>>>
>>>>> * YARN-701
>>>>>
>>>>> It should be addressed before a GA release.
>>>>>
>>>>> Still, as it is this breaks unmanaged AMs and to me that would be a blocker for the beta.
>>>>>
>>>>> YARN-701 and the unmanaged AMs fix should be committed in tandem.
>>>>>
>>>>> * YARN-918
>>>>>
>>>>> It is a consequence of YARN-701 and depends on it.
>>>>
>>>> YARN-918 is an API change. And YARN-701 is a behaviour change. We need both in 2.1.0.
>>>>
>>>>> * YARN-926
>>>>>
>>>>> It would be nice to have it addressed before GA release.
>>>>
>>>> Either ways. I'd get it in sooner than later specifically when we are trying to replace the old API with the new one.
>>>>
>>>> Thanks,
>>>> +Vino
>
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/
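For context, the QOP options Daryn mentions hinge on a single knob in core-site.xml, which is why a unit test would be cheap; something like:

    <property>
      <name>hadoop.rpc.protection</name>
      <!-- one of: authentication (default) | integrity | privacy -->
      <value>privacy</value>
    </property>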
Re: [VOTE] Release Apache Hadoop 2.1.0-beta
I've been OOO (got a call to fix this bug), but just to clarify: We're ok with HA _not working at all_ with security enabled in 2.1.0-beta? That's the ramification of omitting HADOOP-9880.

Daryn

On Aug 16, 2013, at 9:20 PM, Arun C Murthy wrote:

> Yep, there are quite a number of such fixes in 2.1.1 ATM, I think it will serve us better to get 2.1.0 out and then quickly turn around to make 2.1.1.
>
> My current plan is to start work on 2.1.1 right after this release gets complete… hopefully next week.
>
> thanks,
> Arun
>
> On Aug 16, 2013, at 4:36 PM, Vinod Kumar Vavilapalli wrote:
>
>> There are other such isolated and well understood bug-fixes that we pushed to 2.1.1 in the interest of making progress with 2.1.0 and the corresponding API changes.
>>
>> 2.1.1 should happen soon enough after this.
>>
>> Thanks,
>> +Vinod
>>
>> On Aug 16, 2013, at 4:22 PM, Roman Shaposhnik wrote:
>>
>>> What are the downsides of getting this fix into the 2.1? It appears that the fix is pretty isolated and well understood.
>>>
>>> Thoughts?
>>>
>>> Thanks,
>>> Roman.
>>>
>>> On Fri, Aug 16, 2013 at 3:04 PM, Kihwal Lee wrote:
>>>> I've changed the target version of HADOOP-9880 to 2.1.1. Please change it back, if you feel that it needs to be in 2.1.0-beta.
>>>>
>>>> Kihwal
>>>>
>>>> From: Kihwal Lee
>>>> To: Arun Murthy ; "common-dev@hadoop.apache.org"
>>>> Cc: "mapreduce-...@hadoop.apache.org" ; "hdfs-...@hadoop.apache.org" ; "yarn-...@hadoop.apache.org"
>>>> Sent: Friday, August 16, 2013 4:55 PM
>>>> Subject: Re: [VOTE] Release Apache Hadoop 2.1.0-beta
>>>>
>>>> It's your call, Arun. I.e. as long as you believe rc2 meets the expectations and objectives of 2.1.0-beta.
>>>>
>>>> Kihwal
>>>>
>>>> From: Arun Murthy
>>>> To: "common-dev@hadoop.apache.org"
>>>> Cc: Kihwal Lee ; "mapreduce-...@hadoop.apache.org" ; "hdfs-...@hadoop.apache.org" ; "yarn-...@hadoop.apache.org"
>>>> Sent: Friday, August 16, 2013 3:44 PM
>>>> Subject: Re: [VOTE] Release Apache Hadoop 2.1.0-beta
>>>>
>>>> That makes sense too.
>>>>
>>>> On Aug 16, 2013, at 10:39 AM, Vinod Kumar Vavilapalli wrote:
>>>>
>>>>> We need to make a call on what blockers will be. From my limited understanding, this doesn't seem like an API or a compatibility issue. Can we not fix it in subsequent bug-fix releases?
>>>>>
>>>>> I do see a lot of follow up releases to 2.1.0. Getting this release out will help downstream projects start testing with all the API stuff that has already gone in 2.1.0.
>>>>>
>>>>> Thanks,
>>>>> +Vinod
>>>>>
>>>>> On Aug 16, 2013, at 10:13 AM, Kihwal Lee wrote:
>>>>>
>>>>>> We have found HADOOP-9880, which prevents Namenode HA from running with security.
>>>>>>
>>>>>> Kihwal
>>>>>>
>>>>>> From: Arun C Murthy
>>>>>> To: "common-dev@hadoop.apache.org" ; "hdfs-...@hadoop.apache.org" ; "mapreduce-...@hadoop.apache.org" ; "yarn-...@hadoop.apache.org"
>>>>>> Sent: Thursday, August 15, 2013 4:15 PM
>>>>>> Subject: [VOTE] Release Apache Hadoop 2.1.0-beta
>>>>>>
>>>>>> Folks,
>>>>>>
>>>>>> I've created a release candidate (rc2) for hadoop-2.1.0-beta that I would like to get released - this fixes the bugs we saw since the last go-around (rc1).
>>>>>>
>>>>>> The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.1.0-beta-rc2/
>>>>>> The RC tag in svn is here: http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.1.0-beta-rc2
>>>>>>
>>>>>> The maven artifacts are available via repository.apache.org.
>>>>>>
>>>>>> Please try the release and vote; the vote will run for the usual 7 days.
>>>>>>
>>>>>> thanks,
>>>>>> Arun
>>>>>>
>>>>>> --
>>>>>> Arun C. Murthy
>>>>>> Hortonworks Inc.
>>>>>> http://hortonworks.com/
Re: symlink support in Hadoop 2 GA
I reluctantly agree that we should disable symlinks in 2.2 until we can sort out the compatibility issues. I'm reluctant in the sense that it's a feature users have long wanted, and it's something we'd like to use from an administrative view. However I don't see all the issues being sorted out in the very near future.

I filed some jiras today that have led me to believe that the current implementation of fs symlinks is irreparably flawed. Adding optional primitives to filesystems to make them symlink capable is ok. However, adding symlink resolution to individual filesystems is fundamentally broken. It doesn't work for stacked filesystems (viewfs, chroots, filters, etc) because the resolution must occur at the highest level, not within an individual filesystem itself. Otherwise the abstraction of the top-level filesystem is violated and all kinds of unexpected behavior like walking out of chroots becomes possible.

Daryn

On Oct 3, 2013, at 1:39 PM, sanjay Radia wrote:

> There are a number of issues (some minor, some more than minor). GA is close and we are still in discussion on some of them; while I believe we will close on these very very shortly, a code change like this so close to GA is dangerous.
>
> I suggest we do the following:
> 1) Disable symlinks in 2.2 GA - throw unsupported exception on createSymlink in both FileSystem and FileContext.
> 2) Deal with the isDir() in 2.2 GA in preparation for item 3 coming after GA:
>    a) Deprecate isDir()
>    b) Add a new API that returns an enum (see FileContext).
> 3) Fix symlinks in a future release, hopefully the very next one after 2.2 GA:
>    a) change the stack to use the new API replacing isDir().
>    b) fix isDir() to do something smarter (we can detail this later but there is a solution that has been discussed). This helps customer applications that call isDir().
>    c) Remove isDir in a future release when customers have had sufficient time to migrate.
>
> sanjay
>
> PS. J Rottinghuis expressed a similar sentiment in a previous email in this thread:
>
> On Sep 18, 2013, at 5:11 PM, J. Rottinghuis wrote:
>
>> I like symlink functionality, but in our migration to Hadoop 2.x this is a total distraction. If the APIs stay in 2.2 GA we'll have to choose to:
>> a) Not uprev until symlink support is figured out up and down the stack, and we've been able to migrate all our 1.x (equivalent) clusters to 2.x (equivalent). Or
>> b) rip out the API altogether. Or
>> c) change the implementation to throw an UnsupportedOperationException
>> I'm not sure yet which of these I like least.
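Sanjay's item 2 sketched as code (FileType and getType() are hypothetical names used only to illustrate the proposal, not a committed API):

    public class StatusSketch {
      public enum FileType { FILE, DIRECTORY, SYMLINK }

      private final FileType type;

      public StatusSketch(FileType type) { this.type = type; }

      // 2b: the new API -- callers can distinguish symlinks instead of
      // collapsing everything into a lossy boolean.
      public FileType getType() { return type; }

      // 2a: deprecated but kept so applications have time to migrate
      // (item 3c removes it later).
      @Deprecated
      public boolean isDir() { return type == FileType.DIRECTORY; }
    }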
Re: [VOTE] Release Apache Hadoop 2.2.0
+1 I did some basic testing and it appears to work fine.

Daryn

On Oct 7, 2013, at 2:00 AM, Arun C Murthy wrote:

> Folks,
>
> I've created a release candidate (rc0) for hadoop-2.2.0 that I would like to get released - this release fixes a small number of bugs and some protocol/api issues which should ensure they are now stable and will not change in hadoop-2.x.
>
> The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.2.0-rc0
> The RC tag in svn is here: http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.2.0-rc0
>
> The maven artifacts are available via repository.apache.org.
>
> Please try the release and vote; the vote will run for the usual 7 days.
>
> thanks,
> Arun
>
> P.S.: Thanks to Colin, Andrew, Daryn, Chris and others for helping nail down the symlinks-related issues. I'll release note the fact that we have disabled it in 2.2. Also, thanks to Vinod for some heavy-lifting on the YARN side in the last couple of weeks.
>
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/
Re: How can I get FSNamesystem of running NameNode in cluster?
Are you adding something internal to the NN? If not, you cannot get the namesystem instance via a client unless you are using a minicluster object.

Daryn

On Dec 9, 2013, at 7:11 AM, Yoonmin Nam wrote:

> I want to get a running instance of FSNamesystem of HDFS. However, it is somewhat more complicated than I expected.
>
> If I can get the NameNode instance of the running cluster, then it can be solved because there is a method "getNamespace()".
>
> Is there anyone who knows about this stuff?
>
> I thought that using Servlet stuff is not the normal way to do this because my program is not a web application.
>
> Thanks!
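For tests, the minicluster route Daryn mentions looks roughly like this (a sketch; MiniDFSCluster and FSNamesystem live in the hdfs test and server packages, so this only works inside NN-side code or tests, not in an ordinary client):

    Configuration conf = new HdfsConfiguration();
    MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
    try {
      cluster.waitActive();
      // The live namesystem -- not reachable from an ordinary HDFS client.
      FSNamesystem fsn = cluster.getNamesystem();
      System.out.println("total blocks: " + fsn.getBlocksTotal());
    } finally {
      cluster.shutdown();
    }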
Re: [VOTE] Merging branch HDFS-7240 to trunk
I’m generally neutral and looked foremost at developer impact. I.e. will it be so intertwined with hdfs that each project risks destabilizing the other? Will developers with no expertise in ozone be impeded? I think the answer is currently no. These are the intersections and some concerns, based on the assumption ozone is accepted into the project:

Common
There appear to be a number of superfluous changes. The conf servlet must not be polluted with specific references and logic for ozone. We don’t create dependencies from common to hdfs, mapred, yarn, hive, etc. Common must be “ozone free”.

Datanode
I expected ozone changes to be intricately linked with the existing blocks map, dataset, volume, etc. Thankfully it’s not. As an independent service, the DN should not be polluted with specific references to ozone. If ozone is in the project, the DN should have a generic plugin interface conceptually similar to the NM aux services.

Namenode
No impact, currently, but certainly will be…

Code Location
I don’t feel hadoop-hdfs-project/hadoop-hdfs is an acceptable location. I’d rather see hadoop-hdfs-project/hadoop-hdsl, or even better hadoop-hdsl-project. This clean separation will make it easier to later spin off or pull in depending on which way we vote.

Dependencies
Owen hit upon this before I could send. Hadoop is already bursting with dependencies; I hope this doesn’t pull in a lot more.

––

Do I think ozone should be a separate project? If we view it only as a competing filesystem, then clearly yes. If it’s a low risk evolutionary step with near-term benefits, no, we want to keep it close and help it evolve.

I think ozone/hdsl/whatever has been poorly marketed and is an umbrella term for too many technologies that should perhaps be split. I'm interested in the container block management. I have little interest at this time in the key store. The usability of ozone, specifically container management, is unclear to me. It lacks basic features like changing replication factors, append, a migration path, security, etc - I know there are good plans for all of it - yet another goal is splicing into the NN. That’s a lot of high priority items to tackle that need to be carefully orchestrated before contemplating BM replacement. Each of those is a non-starter for (my) production environment. We need to make sure we can reach a consensus on the block level functionality before rushing it into the NN. That’s independent of whether we allow it into the project.

The BM/SCM changes to the NN are realistically going to be contentious & destabilizing. If done correctly, the BM separation will be a big win for the NN. If ozone is out, by necessity interfaces will need to be stable and well-defined, but we won’t get that right for a long time. Interface and logic changes that break the other will be difficult to coordinate and we’ll likely veto changes that impact the other. If ozone is in, we can hopefully synchronize the changes with less friction, but it greatly increases the chances of developers riddling the NN with hacks and/or ozone specific logic that makes it even more brittle. I will note we need to be vigilant against pervasive conditionals (ie. EC, snapshots). In either case, I think ozone must agree to not impede current hdfs work.

An analogy: hdfs is a store owner that plans to maybe retire in 5 years. A potential new owner (ozone) is lined up and hdfs graciously gives them no-rent space (the DN). The precondition is helping improve the store. Don’t make a mess and expect hdfs to clean it up. Don’t make renovations that complicate hdfs but ignore it in anticipation of its departure/demise. I’m not implying that’s currently happening, it’s just what I don’t want to see. We as a community and our customers need an evolution, not a revolution, and definitely not a civil war.

Hdfs has too much legacy code rot that is hard to change. Too many poorly implemented features. Perhaps I’m overly optimistic that freshly redesigned code can counterbalance performance degradations in the NN. I’m also reluctant, but realize it is being driven by some hdfs veterans that know/understand historical hdfs design strengths and flaws.

If the initially cited issues are addressed, I’m +0.5 for the concept of bringing in ozone if it's not going to be a proverbial bull in the china shop.

Daryn

On Mon, Feb 26, 2018 at 3:18 PM, Jitendra Pandey wrote:

> Dear folks,
>    We would like to start a vote to merge the HDFS-7240 branch into trunk. The context can be reviewed in the DISCUSSION thread, and in the jiras (see references below).
>
> HDFS-7240 introduces Hadoop Distributed Storage Layer (HDSL), which is a distributed, replicated block layer.
> The old HDFS namespace and NN can be connected to this new block layer as we have described in HDFS-10419.
> We also introduce a key-value namespace called Ozone built on HDSL.
>
> The code is in a sep
Re: [DISCUSS] Hadoop RPC encryption performance improvements
Various KMS tasks have been delaying my RPC encryption work – which is 2nd on my TODO list. It's becoming a top priority for us so I'll try my best to get a preliminary netty server patch (sans TLS) up this week if that helps.

The two cited jiras had some critical flaws. Skimming my comments, both use blocking IO (an obvious nonstarter). HADOOP-10768 is a hand-rolled TLS-like encryption which I don't feel is something the community can or should maintain from a security standpoint.

Daryn

On Wed, Oct 31, 2018 at 8:43 AM Wei-Chiu Chuang wrote:

> Ping. Anyone? Cloudera is interested in moving forward with the RPC encryption improvements, but I'd just like to get a consensus on which approach to go with.
>
> Otherwise I'll pick HADOOP-10768 since it's ready for commit, and I've spent time on testing it.
>
> On Thu, Oct 25, 2018 at 11:04 AM Wei-Chiu Chuang wrote:
>
>> Folks,
>>
>> I would like to invite all to discuss the various Hadoop RPC encryption performance improvements. As you probably know, Hadoop RPC encryption currently relies on Java SASL, and has _really_ bad performance (in terms of number of RPCs per second, around 15~20% of the rate without SASL).
>>
>> There have been some attempts to address this, most notably, HADOOP-10768 <https://issues.apache.org/jira/browse/HADOOP-10768> (Optimize Hadoop RPC encryption performance) and HADOOP-13836 <https://issues.apache.org/jira/browse/HADOOP-13836> (Securing Hadoop RPC using SSL). But it looks like both attempts have not been progressing.
>>
>> During the recent Hadoop contributor meetup, Daryn Sharp mentioned he's working on another approach that leverages Netty for its SSL encryption, and then integrates Netty with Hadoop RPC so that Hadoop RPC automatically benefits from netty's SSL encryption performance.
>>
>> So there are at least 3 attempts to address this issue as I see it. Do we have a consensus that:
>> 1. this is an important problem
>> 2. which approach we want to move forward with
>>
>> --
>> A very happy Hadoop contributor
>
> --
> A very happy Hadoop contributor

--
Daryn
Re: [VOTE] Merge HDFS-12943 branch to trunk - Consistent Reads from Standby
-1 pending additional info. After a cursory scan, I have serious concerns regarding the design. This seems like a feature that should have been purely implemented in hdfs w/o touching the common IPC layer.

The biggest issue is the alignment context. Its purpose appears to be for allowing handlers to reinsert calls back into the call queue. That's completely unacceptable. A buggy or malicious client can easily cause livelock in the IPC layer with handlers only looping on calls that never satisfy the condition. Why is this not implemented via RetriableExceptions?

On Thu, Dec 6, 2018 at 1:24 AM Yongjun Zhang wrote:

> Great work guys.
>
> Wonder if we can elaborate on the impact of not having #2 fixed, and why #2 is not needed for the feature to complete?
> 2. Need to fix automatic failover with ZKFC. Currently it doesn't know about ObserverNodes, trying to convert them to SBNs.
>
> Thanks.
> --Yongjun
>
> On Wed, Dec 5, 2018 at 5:27 PM Konstantin Shvachko wrote:
>
>> Hi Hadoop developers,
>>
>> I would like to propose to merge to trunk the feature branch HDFS-12943 for Consistent Reads from Standby Node. The feature is intended to scale read RPC workloads. On large clusters reads comprise 95% of all RPCs to the NameNode. We should be able to accommodate higher overall RPC workloads (up to 4x by some estimates) by adding multiple ObserverNodes.
>>
>> The main functionality has been implemented, see sub-tasks of HDFS-12943.
>> We followed up with the test plan. Testing was done on two independent clusters (see HDFS-14058 and HDFS-14059) with security enabled.
>> We ran standard HDFS commands, MR jobs, admin commands including manual failover.
>> We know of one cluster running this feature in production.
>>
>> There are a few outstanding issues:
>> 1. Need to provide proper documentation - a user guide for the new feature
>> 2. Need to fix automatic failover with ZKFC. Currently it doesn't know about ObserverNodes, trying to convert them to SBNs.
>> 3. Scale testing and performance fine-tuning
>> 4. As testing progresses, we continue fixing non-critical bugs like HDFS-14116.
>>
>> I attached a unified patch to the umbrella jira for the review and Jenkins build.
>> Please vote on this thread. The vote will run for 7 days until Wed Dec 12.
>>
>> Thanks,
>> --Konstantin

--
Daryn
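A sketch of the alternative Daryn raises: the handler rejects the call with org.apache.hadoop.ipc.RetriableException so the client's retry policy re-sends it, instead of the server re-inserting the call into its own queue (the method and state-id names here are hypothetical, for illustration only):

    void checkClientState(long clientStateId, long serverStateId) throws RetriableException {
      if (serverStateId < clientStateId) {
        // The handler thread is freed immediately; a buggy or malicious
        // client can only hurt itself, not livelock the server's handlers.
        throw new RetriableException("server not caught up to state id " + clientStateId);
      }
    }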
Re: [DISCUSS] Secure Hadoop without Kerberos
There are a few too many issues being mixed here. We aren’t very far from having OIDC support. The pre-requisite RPC/TLS & RPC/mTLS work recently completed rollout to our entire production grid. The majority of the past year was spent shaking out bugs and ensuring 100% compatibility. There are a few rough edges I need to clean up for a community release.

A few weeks ago I created a rough POC to leverage RPC/mTLS with OIDC access tokens. The goal is that a mTLS cert may be blessed to impersonate with an access token. A compromised service may only be abused to impersonate users that have recently accessed said service. Kerberos, mTLS, and OIDC may all be simultaneously supported.

Part of the simplicity is that regardless of the client’s authn/authz, delegation tokens are still acquired by jobs to avoid short-lived identity credential expiration. Credential refreshing is a bigger can of worms that requires careful thought and a separate discussion.

On Thu, May 21, 2020 at 12:32 PM Vipin Rathor wrote:

> Hi Eric,
>
> Thanks for starting this discussion.
>
>> Kerberos was developed a decade before web development became popular. There are some Kerberos limitations which do not work well in Hadoop.
>
> Sure, Kerberos was developed long before the web but it was selected as the de facto authentication mechanism in Hadoop after the internet boom. And it was selected for a reason - it is one of the strongest symmetric key based authentication mechanisms out there which doesn't transmit the password in plain text. Kerberos has been around for a long time and has stood the test of time.
>
>> Microsoft Active Directory, which is extensively used in many organizations, is based on Kerberos.
>
> +1 to this.
> And the fact that Microsoft has put Active Directory in Azure too tells me that AD (and therefore Kerberos) is not going away any time soon.
>
> Overall, I agree with Rajive and Craig on this topic. Paving the way for OpenID Connect in Hadoop is a good idea but seeing it as a replacement for Kerberos needs to be carefully thought out. All the problems described in the original mail are not really Kerberos issues. Yes, we do understand that making Kerberos work *in the right way* is always an uphill task (I'm a long-time Kerberos+Hadoop Support Engineer) but that shouldn't be the reason to replace it.
>
>> Hint: CVE-2020-9492
>
> Btw, the CVE-2020-9492 is not accessible right now in the CVE database, maybe it is not yet public.
>
> On Thu, May 21, 2020 at 9:22 AM Steve Loughran wrote:
>
>> On Wed, 6 May 2020 at 23:32, Eric Yang wrote:
>>
>>> Hi all,
>>>
>>> 4. Passing different forms of tokens does not work well with cloud provider security mechanisms. For example, passing an AWS STS token for an S3 bucket. There is no renewal mechanism, nor a good way to identify when the token would expire.
>>
>> well, HADOOP-14556 does it fairly well, supporting session and role tokens. We even know when they expire because we ask for a duration when we request the session/role creds.
>> See org.apache.hadoop.fs.s3a.auth.delegation.AbstractS3ATokenIdentifier for the core of what we marshall, including encryption secrets.
>>
>> The main issue there is that Yarn can't refresh those tokens because a new triple of session credentials is required; currently token renewal assumes the token is unchanged and a request is made to the service to update their table of issued tokens. But even if the RM could get back a new token from a refresh call, we are left with the problem of "how to get an updated set of creds to each process".
[jira] [Created] (HADOOP-8311) FSInputStream's positioned read fails to check seek
Daryn Sharp created HADOOP-8311: --- Summary: FSInputStream's positioned read fails to check seek Key: HADOOP-8311 URL: https://issues.apache.org/jira/browse/HADOOP-8311 Project: Hadoop Common Issue Type: Bug Affects Versions: 0.23.3, 0.24.0, 2.0.0 Reporter: Daryn Sharp {{FSInputStream#read(long, byte[], int, int)}} will seek into the input stream, but does not check that the seek actually succeeded.
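The positioned read the JIRA above refers to is essentially a seek/read/restore sequence; a sketch of the missing validation (illustrative, not the actual FSInputStream source):

    public synchronized int read(long position, byte[] buffer, int offset, int length)
        throws IOException {
      long oldPos = getPos();
      try {
        seek(position);
        if (getPos() != position) {      // the check the JIRA says is absent
          throw new IOException("seek to " + position + " failed, pos=" + getPos());
        }
        return read(buffer, offset, length);
      } finally {
        seek(oldPos);                    // restore the caller's stream position
      }
    }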
[jira] [Resolved] (HADOOP-8311) FSInputStream's positioned read fails to check seek
[ https://issues.apache.org/jira/browse/HADOOP-8311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp resolved HADOOP-8311. - Resolution: Invalid > FSInputStream's positioned read fails to check seek > --- > > Key: HADOOP-8311 > URL: https://issues.apache.org/jira/browse/HADOOP-8311 > Project: Hadoop Common > Issue Type: Bug > Affects Versions: 0.23.3, 0.24.0, 2.0.0 > Reporter: Daryn Sharp > > {{FSInputStream#read(long, byte[], int, int)}} will seek into the input > stream, but does not check that the seek actually succeeded.
[jira] [Created] (HADOOP-8334) HttpServer sometimes returns incorrect port
Daryn Sharp created HADOOP-8334: --- Summary: HttpServer sometimes returns incorrect port Key: HADOOP-8334 URL: https://issues.apache.org/jira/browse/HADOOP-8334 Project: Hadoop Common Issue Type: Bug Affects Versions: 0.23.3, 0.24.0, 2.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp {{HttpServer}} is not always returning the correct listening port.
[jira] [Created] (HADOOP-8335) Improve Configuration's address handling
Daryn Sharp created HADOOP-8335: --- Summary: Improve Configuration's address handling Key: HADOOP-8335 URL: https://issues.apache.org/jira/browse/HADOOP-8335 Project: Hadoop Common Issue Type: Improvement Components: util Affects Versions: 0.23.3, 0.24.0, 2.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp There's a {{Configuration#getSocketAddr}} but no symmetrical {{setSocketAddr}}. An {{updateSocketAddr}} would also be very handy for yarn's updating of wildcard addresses in the config.
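A sketch of the symmetry being requested. The getter exists per the JIRA text; the setter and wildcard-update shapes below follow the JIRA's own naming and are illustrative until the patch lands (the config key and port are examples):

    // Existing read side: parse "host:port" with defaults.
    InetSocketAddress addr =
        conf.getSocketAddr("yarn.resourcemanager.address", "0.0.0.0:8032", 8032);

    // Proposed symmetric write side.
    conf.setSocketAddr("yarn.resourcemanager.address", addr);

    // Proposed wildcard update for yarn: once the server binds, replace
    // 0.0.0.0 in the conf with the address actually listened on.
    conf.updateSocketAddr("yarn.resourcemanager.address", server.getListenerAddress());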
[jira] [Created] (HADOOP-8373) Port RPC.getServerAddress to 0.23
Daryn Sharp created HADOOP-8373: --- Summary: Port RPC.getServerAddress to 0.23 Key: HADOOP-8373 URL: https://issues.apache.org/jira/browse/HADOOP-8373 Project: Hadoop Common Issue Type: Improvement Components: ipc Affects Versions: 0.23.3 Reporter: Daryn Sharp Assignee: Daryn Sharp {{RPC.getServerAddress}} was introduced in trunk/2.0 as part of larger HA changes. 0.23 does not have HA, but this non-HA specific method is needed.
[jira] [Resolved] (HADOOP-8422) FileSystem#getDefaultBlockSize and Replication don't use the given path
[ https://issues.apache.org/jira/browse/HADOOP-8422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp resolved HADOOP-8422. - Resolution: Not A Problem > FileSystem#getDefaultBlockSize and Replication don't use the given path > --- > > Key: HADOOP-8422 > URL: https://issues.apache.org/jira/browse/HADOOP-8422 > Project: Hadoop Common > Issue Type: Bug > Components: fs > Affects Versions: 2.0.0 > Reporter: Eli Collins > Priority: Minor > > The javadocs for FileSystem#getDefaultBlockSize and > FileSystem#getDefaultReplication claim that "The given path will be used to > locate the actual filesystem" however they both ignore the path.
[jira] [Reopened] (HADOOP-8422) FileSystem#getDefaultBlockSize and Replication don't use the given path
[ https://issues.apache.org/jira/browse/HADOOP-8422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp reopened HADOOP-8422: - Re-opening and re-targeting to 1.x for a backport of replication factor and block size fixes for {{ViewFileSystem}}. > FileSystem#getDefaultBlockSize and Replication don't use the given path > --- > > Key: HADOOP-8422 > URL: https://issues.apache.org/jira/browse/HADOOP-8422 > Project: Hadoop Common > Issue Type: Bug > Components: fs > Affects Versions: 1.0.3 > Reporter: Eli Collins > Priority: Minor > > The javadocs for FileSystem#getDefaultBlockSize and > FileSystem#getDefaultReplication claim that "The given path will be used to > locate the actual filesystem" however they both ignore the path.
[jira] [Created] (HADOOP-8490) Add Configuration to FileSystem cache key
Daryn Sharp created HADOOP-8490: --- Summary: Add Configuration to FileSystem cache key Key: HADOOP-8490 URL: https://issues.apache.org/jira/browse/HADOOP-8490 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 2.0.0-alpha, 0.23.0, 0.24.0 Reporter: Daryn Sharp Assignee: Daryn Sharp {{FileSystem#get(URI, Configuration)}} does not take the given {{Configuration}} into consideration before returning an existing fs instance from the cache with a possibly different conf.
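Until the cache key accounts for the conf, the usual workarounds bypass the cache entirely; a sketch:

    // 1) Always construct a fresh instance for this conf; the caller must close() it.
    FileSystem fs = FileSystem.newInstance(uri, conf);

    // 2) Or disable caching per scheme so get() honors the passed conf.
    conf.setBoolean("fs.hdfs.impl.disable.cache", true);
    FileSystem fs2 = FileSystem.get(uri, conf);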
[jira] [Resolved] (HADOOP-8517) --config option does not work with Hadoop installation on Windows
[ https://issues.apache.org/jira/browse/HADOOP-8517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp resolved HADOOP-8517. - Resolution: Not A Problem > --config option does not work with Hadoop installation on Windows > - > > Key: HADOOP-8517 > URL: https://issues.apache.org/jira/browse/HADOOP-8517 > Project: Hadoop Common > Issue Type: Bug > Reporter: Trupti Dhavle > > I ran following command > hadoop --config c:\\hadoop\conf fs -ls / > I get following error for --config option > Unrecognized option: --config > Could not create the Java virtual machine.
[jira] [Created] (HADOOP-8606) FileSystem.get may return the wrong filesystem
Daryn Sharp created HADOOP-8606: --- Summary: FileSystem.get may return the wrong filesystem Key: HADOOP-8606 URL: https://issues.apache.org/jira/browse/HADOOP-8606 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 2.0.0-alpha, 0.23.0, 1.0.0, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp {{FileSystem.get(URI, conf)}} will return the default fs if the scheme is null, regardless of whether the authority is null too. This causes URIs of "//authority/path" to _always_ refer to "/path" on the default fs. To the user, this appears to "work" if the authority in the null-scheme URI matches the authority of the default fs. When the authorities don't match, the user is very surprised that the default fs is used.
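A small sketch of the surprise (hostnames invented; the behavior shown is the one the report describes, where a null scheme short-circuits to the default fs):
{code}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class NullSchemeDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://nn1.example.com");

    URI uri = URI.create("//nn2.example.com/path");
    // uri.getScheme() is null, but uri.getAuthority() is "nn2.example.com".
    // With the bug, the authority is never checked, so the call silently
    // resolves to hdfs://nn1.example.com/path.
    FileSystem fs = FileSystem.get(uri, conf);
    System.out.println(fs.getUri());  // hdfs://nn1.example.com
  }
}
{code}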
[jira] [Reopened] (HADOOP-8577) The RPC must have failed proxyUser (auth:SIMPLE) via realus...@hadoop.apache.org (auth:SIMPLE)
[ https://issues.apache.org/jira/browse/HADOOP-8577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp reopened HADOOP-8577: - > The RPC must have failed proxyUser (auth:SIMPLE) via > realus...@hadoop.apache.org (auth:SIMPLE) > -- > > Key: HADOOP-8577 > URL: https://issues.apache.org/jira/browse/HADOOP-8577 > Project: Hadoop Common > Issue Type: Bug > Components: test > Environment: Ubuntu 11 > JDK 1.7 > Maven 3.0.4 >Reporter: chandrashekhar Kotekar >Priority: Minor > Original Estimate: 12h > Remaining Estimate: 12h > > Hi, > I have downloaded maven source code today itself and tried test it. I did > following steps : > 1) mvn clean > 2) mvn compile > 3) mvn test > After 3rd step one step failed. Stack trace of failed test is as follows : > Failed tests: > testRealUserIPNotSpecified(org.apache.hadoop.security.TestDoAsEffectiveUser): > The RPC must have failed proxyUser (auth:SIMPLE) via > realus...@hadoop.apache.org (auth:SIMPLE) > testWithDirStringAndConf(org.apache.hadoop.fs.shell.TestPathData): checking > exist > testPartialAuthority(org.apache.hadoop.fs.TestFileSystemCanonicalization): > expected: but was: > testFullAuthority(org.apache.hadoop.fs.TestFileSystemCanonicalization): > expected: but was: myfs://host/file, expected: myfs://host.a.b> > > testShortAuthorityWithDefaultPort(org.apache.hadoop.fs.TestFileSystemCanonicalization): > expected: but was: > > testPartialAuthorityWithDefaultPort(org.apache.hadoop.fs.TestFileSystemCanonicalization): > expected: but was: > testShortAuthority(org.apache.hadoop.fs.TestFileSystemCanonicalization): > expected: but was: > > testIpAuthorityWithOtherPort(org.apache.hadoop.fs.TestFileSystemCanonicalization): > expected: but was: > > testAuthorityFromDefaultFS(org.apache.hadoop.fs.TestFileSystemCanonicalization): > expected: but was: > > testFullAuthorityWithDefaultPort(org.apache.hadoop.fs.TestFileSystemCanonicalization): > expected: but was: myfs://host/file, expected: myfs://host.a.b:123> > > testShortAuthorityWithOtherPort(org.apache.hadoop.fs.TestFileSystemCanonicalization): > expected: but was: > > testPartialAuthorityWithOtherPort(org.apache.hadoop.fs.TestFileSystemCanonicalization): > expected: but was: > > testFullAuthorityWithOtherPort(org.apache.hadoop.fs.TestFileSystemCanonicalization): > expected: but was: myfs://host:456/file, expected: myfs://host.a.b:456> > testIpAuthority(org.apache.hadoop.fs.TestFileSystemCanonicalization): > expected: but was: > > testIpAuthorityWithDefaultPort(org.apache.hadoop.fs.TestFileSystemCanonicalization): > expected: but was: > Tests in error: > testUnqualifiedUriContents(org.apache.hadoop.fs.shell.TestPathData): `d1': > No such file or directory > I am newbie in Hadoop source code world. Please help me in building hadoop > source code.
[jira] [Resolved] (HADOOP-8577) The RPC must have failed proxyUser (auth:SIMPLE) via realus...@hadoop.apache.org (auth:SIMPLE)
[ https://issues.apache.org/jira/browse/HADOOP-8577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp resolved HADOOP-8577. - Resolution: Duplicate > The RPC must have failed proxyUser (auth:SIMPLE) via > realus...@hadoop.apache.org (auth:SIMPLE) > -- > > Key: HADOOP-8577 > URL: https://issues.apache.org/jira/browse/HADOOP-8577 > Project: Hadoop Common > Issue Type: Bug > Components: test > Environment: Ubuntu 11 > JDK 1.7 > Maven 3.0.4 >Reporter: chandrashekhar Kotekar >Priority: Minor > Original Estimate: 12h > Remaining Estimate: 12h > > Hi, > I have downloaded maven source code today itself and tried test it. I did > following steps : > 1) mvn clean > 2) mvn compile > 3) mvn test > After 3rd step one step failed. Stack trace of failed test is as follows : > Failed tests: > testRealUserIPNotSpecified(org.apache.hadoop.security.TestDoAsEffectiveUser): > The RPC must have failed proxyUser (auth:SIMPLE) via > realus...@hadoop.apache.org (auth:SIMPLE) > testWithDirStringAndConf(org.apache.hadoop.fs.shell.TestPathData): checking > exist > testPartialAuthority(org.apache.hadoop.fs.TestFileSystemCanonicalization): > expected: but was: > testFullAuthority(org.apache.hadoop.fs.TestFileSystemCanonicalization): > expected: but was: myfs://host/file, expected: myfs://host.a.b> > > testShortAuthorityWithDefaultPort(org.apache.hadoop.fs.TestFileSystemCanonicalization): > expected: but was: > > testPartialAuthorityWithDefaultPort(org.apache.hadoop.fs.TestFileSystemCanonicalization): > expected: but was: > testShortAuthority(org.apache.hadoop.fs.TestFileSystemCanonicalization): > expected: but was: > > testIpAuthorityWithOtherPort(org.apache.hadoop.fs.TestFileSystemCanonicalization): > expected: but was: > > testAuthorityFromDefaultFS(org.apache.hadoop.fs.TestFileSystemCanonicalization): > expected: but was: > > testFullAuthorityWithDefaultPort(org.apache.hadoop.fs.TestFileSystemCanonicalization): > expected: but was: myfs://host/file, expected: myfs://host.a.b:123> > > testShortAuthorityWithOtherPort(org.apache.hadoop.fs.TestFileSystemCanonicalization): > expected: but was: > > testPartialAuthorityWithOtherPort(org.apache.hadoop.fs.TestFileSystemCanonicalization): > expected: but was: > > testFullAuthorityWithOtherPort(org.apache.hadoop.fs.TestFileSystemCanonicalization): > expected: but was: myfs://host:456/file, expected: myfs://host.a.b:456> > testIpAuthority(org.apache.hadoop.fs.TestFileSystemCanonicalization): > expected: but was: > > testIpAuthorityWithDefaultPort(org.apache.hadoop.fs.TestFileSystemCanonicalization): > expected: but was: > Tests in error: > testUnqualifiedUriContents(org.apache.hadoop.fs.shell.TestPathData): `d1': > No such file or directory > I am newbie in Hadoop source code world. Please help me in building hadoop > source code.
[jira] [Created] (HADOOP-8613) AbstractDelegationTokenIdentifier#getUser() should set token auth type
Daryn Sharp created HADOOP-8613: --- Summary: AbstractDelegationTokenIdentifier#getUser() should set token auth type Key: HADOOP-8613 URL: https://issues.apache.org/jira/browse/HADOOP-8613 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.0.0-alpha, 0.23.0, 1.0.0, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Critical {{AbstractDelegationTokenIdentifier#getUser()}} returns the UGI associated with a token. The UGI's auth type will either be SIMPLE for non-proxy tokens, or PROXY (effective user) and SIMPLE (real user). Instead of SIMPLE, it needs to be TOKEN.
[jira] [Created] (HADOOP-8627) FS deleteOnExit may delete the wrong path
Daryn Sharp created HADOOP-8627: --- Summary: FS deleteOnExit may delete the wrong path Key: HADOOP-8627 URL: https://issues.apache.org/jira/browse/HADOOP-8627 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 2.0.0-alpha, 0.23.0, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Critical {{FilterFileSystem}} is incorrectly delegating {{deleteOnExit}} to the raw underlying fs. This is wrong, because each fs instance is intended to maintain its own pool of temp files. Worse yet, this means registering a file via {{ChRootedFileSystem#deleteOnExit}} will delete the file w/o the chroot path prepended!
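The chroot case, sketched with a hypothetical filter (the class, prefix, and paths below are invented to show the shape of the problem, not the real ChRootedFileSystem code):
{code}
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FilterFileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical stand-in for ChRootedFileSystem: paths are remapped under
// a chroot prefix before reaching the underlying fs.
class ChrootedFs extends FilterFileSystem {
  private static final String CHROOT = "/data/appA";  // example prefix

  ChrootedFs(FileSystem underlying) {
    super(underlying);
  }

  private Path resolve(Path p) {
    return new Path(CHROOT + p.toUri().getPath());
  }

  @Override
  public boolean delete(Path f, boolean recursive) throws IOException {
    return super.delete(resolve(f), recursive);
  }

  // The bug: deleteOnExit is forwarded untranslated to the raw fs, so at
  // close() the raw fs deletes /tmp/part-0000 rather than
  // /data/appA/tmp/part-0000. Keeping the registration in this instance,
  // so the eventual delete() goes back through resolve(), avoids that.
}
{code}
One natural fix is for FilterFileSystem to stop forwarding deleteOnExit at all, so the registered path is deleted through the filter's own delete() (and thus resolved) at close time.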
[jira] [Created] (HADOOP-8633) Interrupted FsShell copies may leave tmp files
Daryn Sharp created HADOOP-8633: --- Summary: Interrupted FsShell copies may leave tmp files Key: HADOOP-8633 URL: https://issues.apache.org/jira/browse/HADOOP-8633 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 2.0.0-alpha, 0.23.0, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Critical Interrupting a copy, e.g. via SIGINT, may cause tmp files to not be removed. If the user is copying large files then the remnants will eat into the user's quota.
[jira] [Created] (HADOOP-8634) Ensure FileSystem#close doesn't squawk for deleteOnExit paths
Daryn Sharp created HADOOP-8634: --- Summary: Ensure FileSystem#close doesn't squawk for deleteOnExit paths Key: HADOOP-8634 URL: https://issues.apache.org/jira/browse/HADOOP-8634 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 0.23.0, 3.0.0, 2.2.0-alpha Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Critical {{FileSystem#deleteOnExit}} doesn't check if the path exists before attempting to delete. Errors may cause unnecessary INFO log squawks.
[jira] [Created] (HADOOP-8635) Cannot cancel paths registered deleteOnExit
Daryn Sharp created HADOOP-8635: --- Summary: Cannot cancel paths registered deleteOnExit Key: HADOOP-8635 URL: https://issues.apache.org/jira/browse/HADOOP-8635 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 2.0.0-alpha, 0.23.0, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Critical {{FileSystem#deleteOnExit}} does not have a symmetric method to unregister files. Since it's used to register temporary files during a copy operation, this can lead to a lot of unnecessary rpc operations for files successfully copied when the {{FileSystem}} is closed.
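The missing symmetry, sketched in a copy routine (the method name cancelDeleteOnExit and the helper are assumptions about the shape of the proposal, and the tmp-file suffix is illustrative):
{code}
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch: the copy registers its tmp file, then unregisters it once the
// rename has succeeded, so close() has nothing left to rpc-delete.
public class CopyWithCancel {
  static void copy(FileSystem fs, Path src, Path dst) throws IOException {
    Path tmp = new Path(dst.getParent(), dst.getName() + "._COPYING_");
    fs.deleteOnExit(tmp);          // clean up if we die mid-copy
    doCopy(fs, src, tmp);          // hypothetical copy helper
    fs.rename(tmp, dst);
    fs.cancelDeleteOnExit(tmp);    // proposed symmetric call: forget the
                                   // tmp path now that it no longer exists
  }

  private static void doCopy(FileSystem fs, Path src, Path dst) {
    // copy loop elided for brevity
  }
}
{code}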
[jira] [Created] (HADOOP-8637) FilterFileSystem#setWriteChecksum is broken
Daryn Sharp created HADOOP-8637: --- Summary: FilterFileSystem#setWriteChecksum is broken Key: HADOOP-8637 URL: https://issues.apache.org/jira/browse/HADOOP-8637 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 2.0.0-alpha, 0.23.0, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Critical {{FilterFileSystem#setWriteChecksum}} is being passed through as {{fs.setVERIFYChecksum}}. An example of the impact: checksums cannot be disabled for LFS if a filter fs (like {{ChRootedFileSystem}}) is applied.
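The one-line bug and its equally small fix, paraphrased rather than quoted from the patch:
{code}
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FilterFileSystem;

public class FixedFilterFileSystem extends FilterFileSystem {
  public FixedFilterFileSystem(FileSystem fs) { super(fs); }

  // Before the fix the override delegated to the wrong method
  // (paraphrased): fs.setVerifyChecksum(writeChecksum), i.e. it toggled
  // read-side verification. The fix delegates to the matching method:
  @Override
  public void setWriteChecksum(boolean writeChecksum) {
    fs.setWriteChecksum(writeChecksum);
  }
}
{code}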
[jira] [Created] (HADOOP-8701) Reduce visibility of getDelegationToken
Daryn Sharp created HADOOP-8701: --- Summary: Reduce visibility of getDelegationToken Key: HADOOP-8701 URL: https://issues.apache.org/jira/browse/HADOOP-8701 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp {{FileSystem#getDelegationToken}} is incompatible with a multi-token fs like viewfs. {{FileSystem#addDelegationTokens}} is being added in HADOOP-7967 to call {{getDelegationToken}} on each of the fs mounts. The visibility of {{getDelegationToken}} must be reduced to protected since it's completely incompatible with a multi-token fs.
[jira] [Created] (HADOOP-8702) Port HADOOP-7967 to FileContext/AbstractFileSystem
Daryn Sharp created HADOOP-8702: --- Summary: Port HADOOP-7967 to FileContext/AbstractFileSystem Key: HADOOP-8702 URL: https://issues.apache.org/jira/browse/HADOOP-8702 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 2.0.0-alpha, 0.23.0, 3.0.0 Reporter: Daryn Sharp Need to add generalized multi-token fs support to FC/AFS.
[jira] [Created] (HADOOP-8725) MR is broken when security is off
Daryn Sharp created HADOOP-8725: --- Summary: MR is broken when security is off Key: HADOOP-8725 URL: https://issues.apache.org/jira/browse/HADOOP-8725 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 0.23.3, 2.1.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Blocker HADOOP-8225 broke MR when security is off. MR was changed to stop re-reading the credentials that UGI had already read, and to stop putting those tokens back into the UGI where they already were. UGI only reads a credentials file when security is enabled, but MR uses tokens (i.e. the job token) even when security is disabled...
[jira] [Created] (HADOOP-8772) RawLocalFileStatus shells out for permission info
Daryn Sharp created HADOOP-8772: --- Summary: RawLocalFileStatus shells out for permission info Key: HADOOP-8772 URL: https://issues.apache.org/jira/browse/HADOOP-8772 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 0.23.3, 3.0.0, 2.2.0-alpha Reporter: Daryn Sharp Priority: Critical {{RawLocalFileStatus}} shells out to run "ls" to get permissions info. This is very inefficient. More importantly, mixing multithreading and forking is risky. Some versions of glibc in RHEL can deadlock in functions such as "__GI__IO_list_lock" and "malloc_atfork". All this unnecessary shelling out to access the local filesystem greatly increases the risk of deadlock. Namely, the NM's user localizer is seen to jam more frequently than the TT & NM.
[jira] [Created] (HADOOP-8779) Use tokens regardless of authentication type
Daryn Sharp created HADOOP-8779: --- Summary: Use tokens regardless of authentication type Key: HADOOP-8779 URL: https://issues.apache.org/jira/browse/HADOOP-8779 Project: Hadoop Common Issue Type: New Feature Components: fs, security Affects Versions: 3.0.0, 2.2.0-alpha Reporter: Daryn Sharp Assignee: Daryn Sharp Security is a combination of authentication and authorization (tokens). Authorization may be granted independently of the authentication model. Tokens should be used regardless of simple or kerberos authentication.
[jira] [Created] (HADOOP-8783) Improve RPC.Server
Daryn Sharp created HADOOP-8783: --- Summary: Improve RPC.Server Key: HADOOP-8783 URL: https://issues.apache.org/jira/browse/HADOOP-8783 Project: Hadoop Common Issue Type: Sub-task Components: ipc, security Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp
[jira] [Created] (HADOOP-8784) Improve IPC.Client's token use
Daryn Sharp created HADOOP-8784: --- Summary: Improve IPC.Client's token use Key: HADOOP-8784 URL: https://issues.apache.org/jira/browse/HADOOP-8784 Project: Hadoop Common Issue Type: Sub-task Components: ipc, security Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp If present, tokens should be sent for all auth types including simple auth.
[jira] [Created] (HADOOP-8785) Remove security conditionals from FileSystems
Daryn Sharp created HADOOP-8785: --- Summary: Remove security conditionals from FileSystems Key: HADOOP-8785 URL: https://issues.apache.org/jira/browse/HADOOP-8785 Project: Hadoop Common Issue Type: Sub-task Components: fs, security Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Streamline {{FileSystem}} classes by removing {{UGI.isSecurityEnabled()}} conditionals.
[jira] [Created] (HADOOP-8906) paths with multiple globs are unreliable
Daryn Sharp created HADOOP-8906: --- Summary: paths with multiple globs are unreliable Key: HADOOP-8906 URL: https://issues.apache.org/jira/browse/HADOOP-8906 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 2.0.0-alpha, 0.23.0, 3.0.0 Reporter: Daryn Sharp Priority: Critical Let's say we have a structure of "$date/$user/stuff/file". Multiple globs are unreliable unless every directory in the structure exists.
These work:
date*/user
date*/user/stuff
date*/user/stuff/file
These fail:
date*/user/*
date*/user/*/*
date*/user/stu*
date*/user/stu*/*
date*/user/stu*/file
date*/user/stuff/*
date*/user/stuff/f*
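A quick way to see it from code, using the public glob API (the /logs paths are invented):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class GlobDemo {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());

    // Reliable: the glob is confined to the leading component.
    FileStatus[] ok = fs.globStatus(new Path("/logs/date*/user/stuff/file"));

    // Unreliable with this bug: a second glob deeper in the path drops
    // matches unless every intermediate directory exists under every
    // matched date* directory.
    FileStatus[] flaky = fs.globStatus(new Path("/logs/date*/user/stu*/file"));

    System.out.println((ok == null ? 0 : ok.length) + " vs "
        + (flaky == null ? 0 : flaky.length));
  }
}
{code}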
[jira] [Resolved] (HADOOP-7973) DistributedFileSystem close has severe consequences
[ https://issues.apache.org/jira/browse/HADOOP-7973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp resolved HADOOP-7973. - Resolution: Won't Fix All branches have this behavior, but a fix has proved to be too contentious. > DistributedFileSystem close has severe consequences > --- > > Key: HADOOP-7973 > URL: https://issues.apache.org/jira/browse/HADOOP-7973 > Project: Hadoop Common > Issue Type: Bug > Components: fs >Affects Versions: 1.0.0 >Reporter: Daryn Sharp >Assignee: Daryn Sharp >Priority: Critical > Attachments: HADOOP-7973-2.patch, HADOOP-7973-3.patch, > HADOOP-7973-4.patch, HADOOP-7973-5.patch, HADOOP-7973.patch > > > The way {{FileSystem#close}} works is very problematic. Since the > {{FileSystems}} are cached, any {{close}} by any caller will cause problems > for every other reference to it. Will add more detail in the comments.
[jira] [Created] (HADOOP-8965) Allow client to specify internal authentication
Daryn Sharp created HADOOP-8965: --- Summary: Allow client to specify internal authentication Key: HADOOP-8965 URL: https://issues.apache.org/jira/browse/HADOOP-8965 Project: Hadoop Common Issue Type: Sub-task Components: ipc Affects Versions: 2.0.0-alpha, 0.23.0, 1.0.0, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp The RPC client currently uses a token if present, else it falls back to its configured authentication method. This creates an ambiguity in the client if SIMPLE auth is allowed to use tokens. A task will continue to run if the task loses its tokens because it will fall back to SIMPLE auth - this would be a bug. There should be a means to specify that tasks must use tokens to avoid the ambiguity.
[jira] [Created] (HADOOP-8991) Add support for SASL PLAIN authentication
Daryn Sharp created HADOOP-8991: --- Summary: Add support for SASL PLAIN authentication Key: HADOOP-8991 URL: https://issues.apache.org/jira/browse/HADOOP-8991 Project: Hadoop Common Issue Type: Sub-task Components: ipc Reporter: Daryn Sharp Assignee: Daryn Sharp Adding SASL PLAIN auth will allow {{isSecurityEnabled}} to mean SASL is active instead of only meaning kerberos is enabled. SASL will always require tokens. PLAIN will be the same as SIMPLE, with the added requirement of tokens.
[jira] [Created] (HADOOP-8999) SASL negotiation is flawed
Daryn Sharp created HADOOP-8999: --- Summary: SASL negotiation is flawed Key: HADOOP-8999 URL: https://issues.apache.org/jira/browse/HADOOP-8999 Project: Hadoop Common Issue Type: Sub-task Components: ipc Reporter: Daryn Sharp Assignee: Daryn Sharp The RPC protocol used for SASL negotiation is flawed. The server's RPC response contains the next SASL challenge token, but a SASL server can return null (I'm done) or an N-byte challenge. The server currently will not send an RPC success response to the client if the SASL server returns null, which causes the client to hang until it times out.
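In code form, the case the server fails to handle. The SaslServer calls are the standard javax.security.sasl API; the send* helpers stand in for the actual RPC response path and are illustrative:
{code}
import javax.security.sasl.SaslException;
import javax.security.sasl.SaslServer;

public class SaslStepSketch {
  // One negotiation step, server side.
  static void processToken(SaslServer saslServer, byte[] clientToken)
      throws SaslException {
    byte[] challenge = saslServer.evaluateResponse(clientToken);
    if (saslServer.isComplete()) {
      // challenge may legitimately be null here ("I'm done"). The bug: no
      // success reply was sent in that case, leaving the client blocked
      // until timeout. The server should always answer, even with zero bytes.
      sendSuccess(challenge == null ? new byte[0] : challenge);
    } else {
      sendChallenge(challenge);
    }
  }

  static void sendSuccess(byte[] token) { /* write RPC success response */ }

  static void sendChallenge(byte[] token) { /* write next challenge */ }
}
{code}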
[jira] [Created] (HADOOP-9009) Add SecurityUtil methods to get/set authentication method
Daryn Sharp created HADOOP-9009: --- Summary: Add SecurityUtil methods to get/set authentication method Key: HADOOP-9009 URL: https://issues.apache.org/jira/browse/HADOOP-9009 Project: Hadoop Common Issue Type: Sub-task Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp The authentication method is handled as a string when an enum is available. Adding methods to get/set the conf value based on the enum will simplify adding new SASL auths such as PLAIN.
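The shape such helpers could take, as a sketch (the actual patch may differ; the conf key "hadoop.security.authentication" and the AuthenticationMethod enum are real):
{code}
import java.util.Locale;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation.AuthenticationMethod;

public class SecurityUtilSketch {
  private static final String KEY = "hadoop.security.authentication";

  public static AuthenticationMethod getAuthenticationMethod(Configuration conf) {
    // "simple" is the documented default for the key.
    String value = conf.get(KEY, "simple");
    return AuthenticationMethod.valueOf(value.toUpperCase(Locale.ROOT));
  }

  public static void setAuthenticationMethod(AuthenticationMethod method,
                                             Configuration conf) {
    conf.set(KEY, method.name().toLowerCase(Locale.ROOT));
  }
}
{code}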
[jira] [Created] (HADOOP-9010) Map UGI authenticationMethod to RPC authMethod
Daryn Sharp created HADOOP-9010: --- Summary: Map UGI authenticationMethod to RPC authMethod Key: HADOOP-9010 URL: https://issues.apache.org/jira/browse/HADOOP-9010 Project: Hadoop Common Issue Type: Sub-task Affects Versions: 0.23.0, 3.0.0, 2.0.3-alpha Reporter: Daryn Sharp Assignee: Daryn Sharp The UGI's authenticationMethod needs a forward mapping to the RPC/SASL authMethod. This will allow for the RPC client to eventually use the UGI's authenticationMethod to derive the authMethod instead of assuming security on is kerberos and security off is simple.
[jira] [Created] (HADOOP-9012) IPC Client sends wrong connection context
Daryn Sharp created HADOOP-9012: --- Summary: IPC Client sends wrong connection context Key: HADOOP-9012 URL: https://issues.apache.org/jira/browse/HADOOP-9012 Project: Hadoop Common Issue Type: Sub-task Components: ipc Affects Versions: 3.0.0, 2.0.3-alpha Reporter: Daryn Sharp Assignee: Daryn Sharp The IPC client will send the wrong connection context when asked to switch to simple auth.
[jira] [Created] (HADOOP-9013) UGI should not hardcode loginUser's authenticationType
Daryn Sharp created HADOOP-9013: --- Summary: UGI should not hardcode loginUser's authenticationType Key: HADOOP-9013 URL: https://issues.apache.org/jira/browse/HADOOP-9013 Project: Hadoop Common Issue Type: Sub-task Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp {{UGI.loginUser}} assumes that the user's auth type for security on = kerberos, security off = simple. It should instead use the configured auth type.
[jira] [Created] (HADOOP-9014) Standardize creation of SaslRpcClients
Daryn Sharp created HADOOP-9014: --- Summary: Standardize creation of SaslRpcClients Key: HADOOP-9014 URL: https://issues.apache.org/jira/browse/HADOOP-9014 Project: Hadoop Common Issue Type: Sub-task Components: ipc Affects Versions: 0.23.0, 3.0.0, 2.0.3-alpha Reporter: Daryn Sharp Assignee: Daryn Sharp To ease adding additional SASL support, need to change the chained conditionals into a switch and make one standard call to createSaslClient.
[jira] [Created] (HADOOP-9015) Standardize creation of SaslRpcServers
Daryn Sharp created HADOOP-9015: --- Summary: Standardize creation of SaslRpcServers Key: HADOOP-9015 URL: https://issues.apache.org/jira/browse/HADOOP-9015 Project: Hadoop Common Issue Type: Sub-task Components: ipc Affects Versions: 2.0.0-alpha, 0.23.0, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp To ease adding additional SASL support, need to merge the multiple switches for mechanism type and server creation into a single switch with a single call to createSaslServer.
[jira] [Created] (HADOOP-9020) Add a SASL PLAIN server
Daryn Sharp created HADOOP-9020: --- Summary: Add a SASL PLAIN server Key: HADOOP-9020 URL: https://issues.apache.org/jira/browse/HADOOP-9020 Project: Hadoop Common Issue Type: Sub-task Components: ipc, security Affects Versions: 2.0.0-alpha, 0.23.0, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Java includes a SASL PLAIN client but not a server.
[jira] [Created] (HADOOP-9021) Enforce configured SASL method on the server
Daryn Sharp created HADOOP-9021: --- Summary: Enforce configured SASL method on the server Key: HADOOP-9021 URL: https://issues.apache.org/jira/browse/HADOOP-9021 Project: Hadoop Common Issue Type: Sub-task Components: ipc, security Affects Versions: 2.0.0-alpha, 0.23.0, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp The RPC needs to restrict itself to only using the configured SASL method.
[jira] [Created] (HADOOP-9034) SASL negotiation is insufficient to support all types
Daryn Sharp created HADOOP-9034: --- Summary: SASL negotiation is insufficient to support all types Key: HADOOP-9034 URL: https://issues.apache.org/jira/browse/HADOOP-9034 Project: Hadoop Common Issue Type: Bug Components: ipc, security Affects Versions: 2.0.0-alpha, 0.23.0, 3.0.0 Reporter: Daryn Sharp A SASL negotiation requires a series of one or more challenge/response exchanges. The current server-side RPC SASL implementation may respond with another challenge, an exception, or a switch to the simple method. The server does not reply when the authentication handshake is complete. For SASL mechanisms that require multiple exchanges before the client believes the authentication is complete, the client has an opportunity to read the exception or the switch to simple. However, some mechanisms, e.g. PLAIN, consider the exchange complete as soon as the client sends its initial response. The proxy call that follows will then read the pending SASL response and throw an incomplete protobuf exception. The same issue may manifest when a client sends the final response for a multi-exchange mechanism and the server returns an exception. Fixing the problem requires breaking RPC compatibility. We should consider having the SASL server always return success when authentication is complete. HADOOP-8999 added a short-term workaround to send a success response only for PLAIN, and for the client to always read at least one RPC response to ensure PLAIN will work. Another complication is that a SASL server returns non-null when initiating another challenge and null when authentication is established. However, the current RPC exchange does not allow a zero-byte response ("client, you initiate the exchange") to be differentiated from a null ("client, we're authenticated!"). We should consider using a different RPC status to indicate SASL authentication is in progress, so a zero-byte RPC success is interpreted as authentication is complete.
[jira] [Created] (HADOOP-9035) Generalize setup of LoginContext
Daryn Sharp created HADOOP-9035: --- Summary: Generalize setup of LoginContext Key: HADOOP-9035 URL: https://issues.apache.org/jira/browse/HADOOP-9035 Project: Hadoop Common Issue Type: Sub-task Components: security Affects Versions: 2.0.0-alpha, 0.23.0, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp The creation of the {{LoginContext}} in {{UserGroupInformation}} has specific cases for specific authentication types. This is inflexible.
[jira] [Created] (HADOOP-9070) Kerberos SASL server cannot find kerberos key
Daryn Sharp created HADOOP-9070: --- Summary: Kerberos SASL server cannot find kerberos key Key: HADOOP-9070 URL: https://issues.apache.org/jira/browse/HADOOP-9070 Project: Hadoop Common Issue Type: Sub-task Components: ipc Affects Versions: 3.0.0, 2.0.3-alpha Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Blocker HADOOP-9015 inadvertently removed a {{doAs}} block around instantiation of the sasl server which renders a server incapable of accepting kerberized connections.
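The removed wrapper looked roughly like this (paraphrased; parameter names are illustrative). The SASL server must be instantiated inside the server's kerberos login context, or it cannot see its key:
{code}
import java.io.IOException;
import java.security.PrivilegedExceptionAction;
import java.util.Map;
import javax.security.auth.callback.CallbackHandler;
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslServer;
import org.apache.hadoop.security.UserGroupInformation;

public class SaslServerSetup {
  // Instantiate the GSSAPI SaslServer as the server's logged-in kerberos
  // principal, so the JAAS subject (and thus the kerberos key) is in scope.
  static SaslServer create(final String protocol, final String serverId,
                           final Map<String, ?> props,
                           final CallbackHandler handler)
      throws IOException, InterruptedException {
    return UserGroupInformation.getCurrentUser().doAs(
        new PrivilegedExceptionAction<SaslServer>() {
          @Override
          public SaslServer run() throws Exception {
            return Sasl.createSaslServer("GSSAPI", protocol, serverId,
                                         props, handler);
          }
        });
  }
}
{code}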
[jira] [Created] (HADOOP-9105) FsShell -moveFromLocal erroneously fails
Daryn Sharp created HADOOP-9105: --- Summary: FsShell -moveFromLocal erroneously fails Key: HADOOP-9105 URL: https://issues.apache.org/jira/browse/HADOOP-9105 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 2.0.0-alpha, 0.23.0, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp The move successfully completes, but then reports an error upon trying to delete the local source directory even though it succeeded.
[jira] [Created] (HADOOP-9161) FileSystem.moveFromLocalFile fails to remove source
Daryn Sharp created HADOOP-9161: --- Summary: FileSystem.moveFromLocalFile fails to remove source Key: HADOOP-9161 URL: https://issues.apache.org/jira/browse/HADOOP-9161 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 2.0.0-alpha, 0.23.0, 3.0.0 Reporter: Daryn Sharp FileSystem.moveFromLocalFile fails with "cannot remove file:/path" after copying the files. It appears to be trying to remove a file URI as a relative path.
[jira] [Created] (HADOOP-9238) FsShell -put from stdin auto-creates paths
Daryn Sharp created HADOOP-9238: --- Summary: FsShell -put from stdin auto-creates paths Key: HADOOP-9238 URL: https://issues.apache.org/jira/browse/HADOOP-9238 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 2.0.0-alpha, 0.23.0, 3.0.0 Reporter: Daryn Sharp FsShell put is no longer supposed to auto-create paths. There's an inconsistency where a put from stdin will still auto-create paths.
[jira] [Created] (HADOOP-9284) Authentication method is wrong if no TGT is present
Daryn Sharp created HADOOP-9284: --- Summary: Authentication method is wrong if no TGT is present Key: HADOOP-9284 URL: https://issues.apache.org/jira/browse/HADOOP-9284 Project: Hadoop Common Issue Type: Sub-task Components: security Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp If security is enabled, {{UGI.getLoginUser()}} will attempt an os-specific login followed by looking for a TGT in the ticket cache. If no TGT is found, the UGI's authentication method is still set as KERBEROS instead of SIMPLE.
[jira] [Created] (HADOOP-9289) FsShell rm -f fails for non-matching globs
Daryn Sharp created HADOOP-9289: --- Summary: FsShell rm -f fails for non-matching globs Key: HADOOP-9289 URL: https://issues.apache.org/jira/browse/HADOOP-9289 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 2.0.0-alpha, 0.23.0, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Rm -f isn't supposed to error for paths that don't exist. It works as expected for exact paths, but fails for non-matching globs.
[jira] [Created] (HADOOP-9317) User cannot specify a kerberos keytab for commands
Daryn Sharp created HADOOP-9317: --- Summary: User cannot specify a kerberos keytab for commands Key: HADOOP-9317 URL: https://issues.apache.org/jira/browse/HADOOP-9317 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 2.0.0-alpha, 0.23.0, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Critical {{UserGroupInformation}} only allows kerberos users to be logged in via the ticket cache when running hadoop commands. {{UGI}} allows a keytab to be used, but it's only exposed programmatically. This forces keytab-based users running hadoop commands to periodically issue a kinit from the keytab. A race condition exists during the kinit when the ticket cache is deleted and re-created. Hadoop commands will fail when the ticket cache does not momentarily exist.
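For context, this is the programmatic path the JIRA says is not reachable from the command line (the principal and keytab path below are examples):
{code}
import org.apache.hadoop.security.UserGroupInformation;

public class KeytabLogin {
  public static void main(String[] args) throws Exception {
    // Available in code, but no hadoop command-line or conf knob exposes it:
    UserGroupInformation.loginUserFromKeytab(
        "svc-etl@EXAMPLE.COM",                    // example principal
        "/etc/security/keytabs/svc-etl.keytab");  // example keytab path
    System.out.println(UserGroupInformation.getLoginUser());
  }
}
{code}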
[jira] [Created] (HADOOP-9336) Allow UGI of current connection to be queried
Daryn Sharp created HADOOP-9336: --- Summary: Allow UGI of current connection to be queried Key: HADOOP-9336 URL: https://issues.apache.org/jira/browse/HADOOP-9336 Project: Hadoop Common Issue Type: Improvement Components: ipc Affects Versions: 2.0.0-alpha, 0.23.0, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Critical Querying {{UGI.getCurrentUser}} is synch'ed and inefficient for short-lived RPC requests. Since the connection already contains the UGI, there should be a means to query it directly and avoid a call to {{UGI.getCurrentUser}}.
[jira] [Created] (HADOOP-9339) IPC.Server incorrectly sets UGI auth type
Daryn Sharp created HADOOP-9339: --- Summary: IPC.Server incorrectly sets UGI auth type Key: HADOOP-9339 URL: https://issues.apache.org/jira/browse/HADOOP-9339 Project: Hadoop Common Issue Type: Bug Components: ipc Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp For non-secure servers, {{IPC.Server#processConnectionContext}} will explicitly set the UGI's auth type to SIMPLE. However the auth type has already been set by this point, and this explicit set causes proxy UGIs to be SIMPLE/SIMPLE instead of PROXY/SIMPLE.
[jira] [Created] (HADOOP-9341) Secret Managers should allow explicit purging of tokens and secret keys
Daryn Sharp created HADOOP-9341: --- Summary: Secret Managers should allow explicit purging of tokens and secret keys Key: HADOOP-9341 URL: https://issues.apache.org/jira/browse/HADOOP-9341 Project: Hadoop Common Issue Type: New Feature Components: security Affects Versions: 2.0.0-alpha, 3.0.0, 0.23.7 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Critical Per HDFS-4477, the fsimage retains all secret keys and uncanceled tokens forever. There should be a way to explicitly purge a secret manager of expired items w/o starting its threads.
[jira] [Created] (HADOOP-9352) Expose UGI.setLoginUser for tests
Daryn Sharp created HADOOP-9352: --- Summary: Expose UGI.setLoginUser for tests Key: HADOOP-9352 URL: https://issues.apache.org/jira/browse/HADOOP-9352 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 2.0.0-alpha, 0.23.0, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp The {{UGI.setLoginUser}} method is not publicly exposed, which makes it impossible to correctly test code executed outside of an explicit {{doAs}}. {{getCurrentUser}}/{{getLoginUser}} will always vivify the login user from the user running the test, and not an arbitrary user to be determined by the test. The method is documented with why it's not ready for prime-time, but it's good enough for tests.
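What a test could do once the setter is exposed (a sketch; createRemoteUser is the real UGI factory method, the test scenario is invented):
{code}
import org.apache.hadoop.security.UserGroupInformation;

public class FakeLoginUserTest {
  public static void main(String[] args) throws Exception {
    // Fabricate an arbitrary login user instead of vivifying the user
    // actually running the test.
    UserGroupInformation fake =
        UserGroupInformation.createRemoteUser("testuser");
    UserGroupInformation.setLoginUser(fake);

    // Code under test that calls getLoginUser()/getCurrentUser() outside
    // a doAs() now sees "testuser".
    System.out.println(UserGroupInformation.getLoginUser().getUserName());
  }
}
{code}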
[jira] [Created] (HADOOP-9363) AuthenticatedURL will NPE if server closes connection
Daryn Sharp created HADOOP-9363: --- Summary: AuthenticatedURL will NPE if server closes connection Key: HADOOP-9363 URL: https://issues.apache.org/jira/browse/HADOOP-9363 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 2.0.0-alpha, 0.23.0, 3.0.0 Reporter: Daryn Sharp An NPE occurs if the server unexpectedly closes the connection for an {{AuthenticatedURL}} w/o sending a response.
[jira] [Created] (HADOOP-9366) AuthenticatedURL.Token has a mutable hashCode
Daryn Sharp created HADOOP-9366: --- Summary: AuthenticatedURL.Token has a mutable hashCode Key: HADOOP-9366 URL: https://issues.apache.org/jira/browse/HADOOP-9366 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 2.0.0-alpha, 0.23.0, 3.0.0 Reporter: Daryn Sharp Hash codes must be immutable, but {{AuthenticatedURL.Token#hashCode}} is not. It will return 0 if the token is not set, else the token's hash code.
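Why a mutable hashCode bites, in miniature (the token string below is fabricated):
{code}
import java.util.HashSet;
import java.util.Set;
import org.apache.hadoop.security.authentication.client.AuthenticatedURL;

public class MutableHashCodeDemo {
  public static void main(String[] args) {
    AuthenticatedURL.Token token = new AuthenticatedURL.Token();
    Set<AuthenticatedURL.Token> cache = new HashSet<>();
    cache.add(token);  // stored in the bucket for hashCode() == 0

    token.set("u=alice&p=alice@EXAMPLE.COM&t=kerberos");  // fabricated value
    // hashCode() changed, so the entry is stranded in the old bucket:
    System.out.println(cache.contains(token));  // false
  }
}
{code}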
[jira] [Created] (HADOOP-9374) Add tokens from -tokenCacheFile into UGI
Daryn Sharp created HADOOP-9374: --- Summary: Add tokens from -tokenCacheFile into UGI Key: HADOOP-9374 URL: https://issues.apache.org/jira/browse/HADOOP-9374 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 2.0.0-alpha, 3.0.0, 0.23.7 Reporter: Daryn Sharp Assignee: Daryn Sharp {{GenericOptionsParser}} accepts a {{-tokenCacheFile}} option. However, it only sets the {{mapreduce.job.credentials.json}} conf value instead of also adding the tokens to the UGI so they are usable by the command being executed.
[jira] [Resolved] (HADOOP-9507) LocalFileSystem rename() is broken in some cases when destination exists
[ https://issues.apache.org/jira/browse/HADOOP-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp resolved HADOOP-9507. - Resolution: Invalid > LocalFileSystem rename() is broken in some cases when destination exists > > > Key: HADOOP-9507 > URL: https://issues.apache.org/jira/browse/HADOOP-9507 > Project: Hadoop Common > Issue Type: Bug > Components: fs >Reporter: Mostafa Elhemali >Assignee: Mostafa Elhemali >Priority: Minor > Attachments: HADOOP-9507.branch-1-win.patch > > > The rename() method in RawLocalFileSystem uses FileUtil.copy() without > realizing that FileUtil.copy() has a special behavior that if you're copying > /foo to /bar and /bar exists and is a directory, it'll copy /foo inside /bar > instead of overwriting it, which is not what rename() wants. So you end up > with weird behaviors like in this repro:
> {code}
> c:
> cd \
> md Foo
> md Bar
> md Foo\X
> md Bar\X
> hadoop fs -mv file:///c:/Foo file:///c:/Bar
> {code}
> At the end of this, you would expect to find only Bar\X, but you instead find > Bar\X\X.
[jira] [Created] (HADOOP-9516) Enable spnego filters only if kerberos is enabled
Daryn Sharp created HADOOP-9516: --- Summary: Enable spnego filters only if kerberos is enabled Key: HADOOP-9516 URL: https://issues.apache.org/jira/browse/HADOOP-9516 Project: Hadoop Common Issue Type: Sub-task Components: security Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Spnego filters are currently enabled if security is enabled - which is predicated on security=kerberos. With the advent of the PLAIN authentication method, the filters should only be enabled if kerberos is enabled.
[jira] [Created] (HADOOP-9645) KerberosAuthenticator NPEs on connect error
Daryn Sharp created HADOOP-9645: --- Summary: KerberosAuthenticator NPEs on connect error Key: HADOOP-9645 URL: https://issues.apache.org/jira/browse/HADOOP-9645 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 2.0.5-alpha Reporter: Daryn Sharp Priority: Critical An NPE occurs if there's a kerberos error during initial connect. In this case, the NN was using an HTTP service principal with a stale kvno. It causes webhdfs to fail in a non-user-friendly manner by masking the real error from the user.
{noformat}
java.lang.RuntimeException: java.lang.NullPointerException
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1241)
        at sun.net.www.protocol.http.HttpURLConnection.getHeaderField(HttpURLConnection.java:2713)
        at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:477)
        at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.isNegotiate(KerberosAuthenticator.java:164)
        at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:140)
        at org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:217)
        at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.openHttpUrlConnection(WebHdfsFileSystem.java:364)
{noformat}
[jira] Created: (HADOOP-7175) Add isEnabled() to Trash
Add isEnabled() to Trash Key: HADOOP-7175 URL: https://issues.apache.org/jira/browse/HADOOP-7175 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 0.22.0 Reporter: Daryn Sharp Fix For: 0.22.0 The moveToTrash method returns false in a number of cases. It's not possible to discern if false means an error occurred. In particular, it's not possible to know whether the trash is disabled or an error occurred.
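One plausible shape for the accessor and a caller using it (a sketch only; the Trash constructor and moveToTrash are the existing API, the isEnabled() accessor is the proposal, and the path is an example):
{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.Trash;

public class TrashDemo {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    Trash trash = new Trash(conf);
    Path victim = new Path("/user/alice/old-data");  // example path

    // With an isEnabled() accessor, a false return from moveToTrash can
    // finally be told apart from "trash is simply turned off".
    if (trash.isEnabled() && !trash.moveToTrash(victim)) {
      throw new IOException("Failed to move " + victim + " to trash");
    }
  }
}
{code}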
[jira] Created: (HADOOP-7176) Redesign FsShell
Redesign FsShell Key: HADOOP-7176 URL: https://issues.apache.org/jira/browse/HADOOP-7176 Project: Hadoop Common Issue Type: Bug Affects Versions: 0.22.0 Reporter: Daryn Sharp Fix For: 0.22.0 The FsShell commands are very tightly coupled together, which makes it unnecessarily hard to write new commands. There is a lot of redundancy between the commands, along with inconsistencies in the handling of paths, the handling of errors, and the correct return of exit codes. The FsShell commands should be subclasses of the fs.shell.Command class which is already being used by the -count command, and is used by other commands like dfsadmin. This will serve as an umbrella bug to track the changes.
[jira] Created: (HADOOP-7180) Improve CommandFormat
Improve CommandFormat - Key: HADOOP-7180 URL: https://issues.apache.org/jira/browse/HADOOP-7180 Project: Hadoop Common Issue Type: Improvement Reporter: Daryn Sharp Fix For: 0.22.0 CommandFormat currently takes an array and offset for parsing and returns a list of arguments. It'd be much more convenient to have it process a list too. It would also be nice to differentiate between too few and too many args instead of the generic "Illegal number of arguments". Finally, CommandFormat is completely devoid of tests.
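A sketch of the friendlier arity errors the description asks for (the class shape and messages are assumptions, not the eventual patch):
{code}
import java.util.List;

// Minimal sketch: arity is enforced on a List, and the two failure modes
// are distinguished instead of one generic "Illegal number of arguments".
public class CommandFormatSketch {
  private final int min;
  private final int max;

  public CommandFormatSketch(int min, int max) {
    this.min = min;
    this.max = max;
  }

  public void checkArity(List<String> args) {
    if (args.size() < min) {
      throw new IllegalArgumentException(
          "Not enough arguments: expected at least " + min);
    }
    if (args.size() > max) {
      throw new IllegalArgumentException(
          "Too many arguments: expected at most " + max);
    }
  }
}
{code}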
[jira] [Created] (HADOOP-7202) Improve Command base class
Improve Command base class -- Key: HADOOP-7202 URL: https://issues.apache.org/jira/browse/HADOOP-7202 Project: Hadoop Common Issue Type: Improvement Affects Versions: 0.22.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Fix For: 0.23.0 Need to extend the Command base class to allow all commands to easily subclass a common set of code that correctly handles globs and exit codes.
[jira] [Created] (HADOOP-7205) automatically determine JAVA_HOME on OS X
automatically determine JAVA_HOME on OS X - Key: HADOOP-7205 URL: https://issues.apache.org/jira/browse/HADOOP-7205 Project: Hadoop Common Issue Type: Improvement Affects Versions: 0.22.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Fix For: 0.23.0 OS X provides a java_home command that will return the user's selected JVM. hadoop-env.sh should use this command if JAVA_HOME is not set.
[jira] [Resolved] (HADOOP-5983) Namenode shouldn't read mapred-site.xml
[ https://issues.apache.org/jira/browse/HADOOP-5983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp resolved HADOOP-5983. - Resolution: Cannot Reproduce > Namenode shouldn't read mapred-site.xml > --- > > Key: HADOOP-5983 > URL: https://issues.apache.org/jira/browse/HADOOP-5983 > Project: Hadoop Common > Issue Type: Bug > Components: conf >Affects Versions: 0.20.0 >Reporter: Rajiv Chittajallu >Assignee: Daryn Sharp > > The name node seem to read mapred-site.xml and fails if it can't parse it. > 2009-06-05 22:37:15,663 FATAL org.apache.hadoop.conf.Configuration: error > parsing conf file: org.xml.sax.SAXParseException: Error attempting to parse > XML file (href='/hadoop/conf/local/local-mapred-site.xml'). > 2009-06-05 22:37:15,664 ERROR > org.apache.hadoop.hdfs.server.namenode.NameNode: java.lang.RuntimeException: > org.xml.sax.SAXParseException: Error attempting to parse XML file > (href='/hadoop/conf/local/local-mapred-site.xml'). > In our config, local-mapred-site.xml is included only in mapred-site.xml > which we don't push to the namenode.
[jira] [Created] (HADOOP-7224) Add command factory to FsShell
Add command factory to FsShell -- Key: HADOOP-7224 URL: https://issues.apache.org/jira/browse/HADOOP-7224 Project: Hadoop Common Issue Type: Improvement Affects Versions: 0.22.0 Reporter: Daryn Sharp Assignee: Daryn Sharp The FsShell has many if/then/else chains for instantiating and running commands. A dynamic mechanism is needed for registering commands such that FsShell requires no changes when adding new commands.
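A sketch of the registry idea (the exact API is an assumption; the Cmd interface stands in for the real fs.shell.Command base class):
{code}
import java.util.HashMap;
import java.util.Map;

// Commands self-register under their names; FsShell just does a lookup,
// so adding a new command no longer requires touching FsShell itself.
public class CommandFactorySketch {
  public interface Cmd { int run(String... args); }

  private final Map<String, Class<? extends Cmd>> classes = new HashMap<>();

  public void addClass(Class<? extends Cmd> cmdClass, String... names) {
    for (String name : names) {
      classes.put(name, cmdClass);
    }
  }

  public Cmd getInstance(String name) throws ReflectiveOperationException {
    Class<? extends Cmd> cmdClass = classes.get(name);
    return cmdClass == null ? null
        : cmdClass.getDeclaredConstructor().newInstance();
  }
}
{code}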
[jira] [Created] (HADOOP-7230) Move -fs usage tests from hdfs into common
Move -fs usage tests from hdfs into common -- Key: HADOOP-7230 URL: https://issues.apache.org/jira/browse/HADOOP-7230 Project: Hadoop Common Issue Type: Test Components: test Affects Versions: 0.23.0 Reporter: Daryn Sharp Assignee: Daryn Sharp The -fs usage tests are in hdfs, which forces an unnecessary synchronized change to both common & hdfs whenever the text changes. The usages have no ties to hdfs, so they should be moved into common.
[jira] [Created] (HADOOP-7231) Fix synopsis for -count
Fix synopsis for -count --- Key: HADOOP-7231 URL: https://issues.apache.org/jira/browse/HADOOP-7231 Project: Hadoop Common Issue Type: Bug Components: util Affects Versions: 0.23.0 Reporter: Daryn Sharp Assignee: Daryn Sharp The synopsis for the count command is wrong. 1) missing a space in "-count[-q]" 2) missing ellipsis for multiple path args
[jira] [Created] (HADOOP-7233) Refactor FsShell's ls
Refactor FsShell's ls - Key: HADOOP-7233 URL: https://issues.apache.org/jira/browse/HADOOP-7233 Project: Hadoop Common Issue Type: Improvement Affects Versions: 0.23.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Need to refactor ls to conform to new FsCommand class.
[jira] [Created] (HADOOP-7234) FsShell tail doesn't handle globs
FsShell tail doesn't handle globs - Key: HADOOP-7234 URL: https://issues.apache.org/jira/browse/HADOOP-7234 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 0.23.0 Environment: The tail command doesn't bother trying to expand its arguments which is inconsistent with other commands. Reporter: Daryn Sharp Assignee: Daryn Sharp
[jira] [Created] (HADOOP-7235) Refactor FsShell's tail
Refactor FsShell's tail --- Key: HADOOP-7235 URL: https://issues.apache.org/jira/browse/HADOOP-7235 Project: Hadoop Common Issue Type: Improvement Affects Versions: 0.23.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Need to refactor tail to conform to new FsCommand class.
[jira] [Created] (HADOOP-7236) Refactor FsShell's mkdir
Refactor FsShell's mkdir Key: HADOOP-7236 URL: https://issues.apache.org/jira/browse/HADOOP-7236 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 0.23.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Need to refactor mkdir to conform to new FsCommand class.
[jira] [Created] (HADOOP-7237) Refactor FsShell's touchz
Refactor FsShell's touchz - Key: HADOOP-7237 URL: https://issues.apache.org/jira/browse/HADOOP-7237 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 0.23.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Need to refactor touchz to conform to new FsCommand class.
[jira] [Created] (HADOOP-7238) Refactor FsShell's cat & text
Refactor FsShell's cat & text - Key: HADOOP-7238 URL: https://issues.apache.org/jira/browse/HADOOP-7238 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 0.23.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Need to refactor cat & text to conform to new FsCommand class.
[jira] [Created] (HADOOP-7249) Refactor FsShell's chmod/chown/chgrp
Refactor FsShell's chmod/chown/chgrp
------------------------------------

Key: HADOOP-7249
URL: https://issues.apache.org/jira/browse/HADOOP-7249
Project: Hadoop Common
Issue Type: Improvement
Components: fs
Affects Versions: 0.23.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp

Need to refactor permissions commands to conform to new FsCommand class.
[jira] [Created] (HADOOP-7250) Refactor FsShell's setrep
Refactor FsShell's setrep
-------------------------

Key: HADOOP-7250
URL: https://issues.apache.org/jira/browse/HADOOP-7250
Project: Hadoop Common
Issue Type: Improvement
Components: fs
Affects Versions: 0.23.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp

Need to refactor setrep to conform to new FsCommand class.
[jira] [Created] (HADOOP-7251) Refactor FsShell's getmerge
Refactor FsShell's getmerge
---------------------------

Key: HADOOP-7251
URL: https://issues.apache.org/jira/browse/HADOOP-7251
Project: Hadoop Common
Issue Type: Improvement
Components: fs
Affects Versions: 0.23.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp

Need to refactor getmerge to conform to new FsCommand class.
[jira] [Created] (HADOOP-7265) Keep track of relative paths
Keep track of relative paths
----------------------------

Key: HADOOP-7265
URL: https://issues.apache.org/jira/browse/HADOOP-7265
Project: Hadoop Common
Issue Type: Improvement
Components: fs
Affects Versions: 0.23.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp

As part of the effort to standardize the display of paths, PathData tracks the exact string used to create a path. When obtaining a directory's contents, the relative nature of the original path should be preserved.
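As an illustration of the intended behavior (the helper below is hypothetical, not the PathData implementation): children found while listing a directory the user named with a relative path keep the user's original prefix.

    import org.apache.hadoop.fs.Path;

    // Illustration only: a child of a directory given as a relative path
    // is displayed with the same relative prefix, not a qualified URI.
    public class RelativeDisplay {
      static String childDisplay(String originalArg, Path resolvedChild) {
        return originalArg + Path.SEPARATOR + resolvedChild.getName();
      }

      public static void main(String[] args) {
        Path child = new Path("/user/me/docs/a.txt");
        // prints "docs/a.txt" rather than "hdfs://nn/user/me/docs/a.txt"
        System.out.println(childDisplay("docs", child));
      }
    }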
[jira] [Created] (HADOOP-7267) Refactor FsShell's rm/rmr/expunge
Refactor FsShell's rm/rmr/expunge
---------------------------------

Key: HADOOP-7267
URL: https://issues.apache.org/jira/browse/HADOOP-7267
Project: Hadoop Common
Issue Type: Improvement
Components: fs
Affects Versions: 0.23.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp

Refactor to conform to the FsCommand class.
[jira] [Created] (HADOOP-7271) Standardize error messages
Standardize error messages
--------------------------

Key: HADOOP-7271
URL: https://issues.apache.org/jira/browse/HADOOP-7271
Project: Hadoop Common
Issue Type: Improvement
Components: fs
Affects Versions: 0.23.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp

The FsShell commands have no standard format for the same error message. For instance, here is a snippet of the variations of just one of many error messages:

cmd: $path: No such file or directory
cmd: cannot stat `$path': No such file or directory
cmd: Can not find listing for $path
cmd: Cannot access $path: No such file or directory.
cmd: No such file or directory `$path'
cmd: File does not exist: $path
cmd: File $path does not exist
... etc ...

These need to be common.
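One way to make them common is to let a single exception type own the message format, so every command reports a missing path identically; the sketch below is an assumption about the shape of the fix, not the committed change.

    import java.io.IOException;

    // Sketch: one exception type owns the "no such file" format, so all
    // commands that throw it produce an identical message.
    public class PathNotFoundException extends IOException {
      public PathNotFoundException(String path) {
        super(path + ": No such file or directory");
      }

      public static void main(String[] args) {
        IOException e = new PathNotFoundException("/no/such/path");
        // a command prefixes its own name, yielding the single form:
        //   ls: /no/such/path: No such file or directory
        System.err.println("ls: " + e.getMessage());
      }
    }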
[jira] [Created] (HADOOP-7275) Refactor FsShell's stat
Refactor FsShell's stat
-----------------------

Key: HADOOP-7275
URL: https://issues.apache.org/jira/browse/HADOOP-7275
Project: Hadoop Common
Issue Type: Improvement
Components: fs
Affects Versions: 0.23.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp

Refactor to conform to the FsCommand class.
[jira] [Created] (HADOOP-7285) Refactor FsShell's test
Refactor FsShell's test
-----------------------

Key: HADOOP-7285
URL: https://issues.apache.org/jira/browse/HADOOP-7285
Project: Hadoop Common
Issue Type: Improvement
Components: fs
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Fix For: 0.23.0

Need to refactor to conform to FsCommand subclass.
[jira] [Created] (HADOOP-7286) Refactor FsShell's du/dus/df
Refactor FsShell's du/dus/df
----------------------------

Key: HADOOP-7286
URL: https://issues.apache.org/jira/browse/HADOOP-7286
Project: Hadoop Common
Issue Type: Improvement
Components: fs
Affects Versions: 0.23.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp

Need to refactor to conform to FsCommand subclass.