Re: Putting the hdfs client as a separate jar

2014-04-07 Thread Haohui Mai
I filed HDFS-6200 to demonstrate the feasibility of the approach.
~Haohui
On Fri, Apr 4, 2014 at 11:46 AM, Haohui Mai wrote:
> I agree with Nicholas, Steve and Alejandro that it might require some
> nontrivial work to achieve the goal. Here is my high-level plan:
>
> 1. Create a new hdfs-client pac

Re: Putting the hdfs client as a separate jar

2014-04-04 Thread Haohui Mai
I agree with Nicholas, Steve and Alejandro that it might require some nontrivial work to achieve the goal. Here is my high-level plan:
1. Create a new hdfs-client package, and gradually move classes from hdfs to hdfs-client. Fortunately, IDEs like Eclipse and IntelliJ can do most of the heavy lifting.

Re: Putting the hdfs client as a separate jar

2014-04-04 Thread Haohui Mai
Tuning the POM only mitigates the problem. The problem with a single HDFS jar is that you can't rule out all unnecessary dependencies. For example, NamenodeWebHdfsMethods depends on jersey-server and servlet. The Apache Falcon project has clients for HDFS, Hive, Pig, and Oozie, thus it pulls in the dependency.
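
A minimal sketch of what the POM-level mitigation could look like in a downstream project, assuming a Maven build that depends on hadoop-hdfs. The exclusion coordinates and the `hadoop.version` property are illustrative assumptions, not taken from this thread, and should be checked against the actual dependency tree (for example with `mvn dependency:tree`):

```xml
<!-- Sketch only: excludes server-side dependencies from a downstream client build. -->
<!-- The coordinates below are assumed; verify them against your Hadoop version. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-hdfs</artifactId>
  <!-- hadoop.version is a hypothetical property defined elsewhere in the POM -->
  <version>${hadoop.version}</version>
  <exclusions>
    <exclusion>
      <groupId>com.sun.jersey</groupId>
      <artifactId>jersey-server</artifactId>
    </exclusion>
    <exclusion>
      <groupId>javax.servlet</groupId>
      <artifactId>servlet-api</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Exclusions like these have to be repeated by every downstream project and only hide the jars from the classpath; the classes that reference them still ship in the single HDFS jar, which is why this only mitigates the problem.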

Re: Putting the hdfs client as a separate jar

2014-04-03 Thread Alejandro Abdelnur
Haohui's suggestion of an hdfs-client JAR with client dependencies only would be, IMO, the 'correct' way of doing things; we should have hdfs-server and hdfs-client JARs. Doing this in practice is not trivial, as classes are not properly segregated. So, Steve's suggestion of an hdfs-client seems

Re: Putting the hdfs client as a separate jar

2014-04-03 Thread Steve Loughran
To follow up with an example, JIRA on updating dependencies and tuning the POMs:
https://issues.apache.org/jira/browse/HADOOP-9991
Here's a JIRA on dropping ZK from the hadoop-client POM:
https://issues.apache.org/jira/browse/HADOOP-9905
And there's an mr-client POM where we've been slowly cut

Re: Putting the hdfs client as a separate jar

2014-04-03 Thread Steve Loughran
On 3 April 2014 00:02, Haohui Mai wrote:
> The rpc and the web client can stay in one jar for the first cut. Indeed it
> might introduce some extra dependency, but the downstream projects always
> have the option to implement the webhdfs protocol themselves if they really
> need to avoid the depe

Re: Putting the hdfs client as a separate jar

2014-04-03 Thread Steve Loughran
It's not an issue with the hdfs/hadoop JARs themselves, but with the POMs, and the same problem exists with the hadoop core JAR: too much stuff you don't need client side. We can address this without changing the packaging into an hdfs-client.jar (and so complicating everything related to HDFS code). All we

Re: Putting the hdfs client as a separate jar

2014-04-02 Thread Haohui Mai
The rpc and the web client can stay in one jar for the first cut. Indeed it might introduce some extra dependency, but the downstream projects always have the option to implement the webhdfs protocol themselves if they really need to avoid the dependency. Hadoop common is a bigger problem. Indeed

Re: Putting the hdfs client as a separate jar

2014-04-02 Thread Tsz Wo Sze
It is a very good idea, although it might not be easy to do. One aspect to consider is whether we need separate jars for the rpc client and the web client. Now, suppose we could successfully separate the HDFS Client jar(s) from HDFS. However, HDFS Client uses Common as a library. We have to separate Com