On 29 March 2014 02:41, Andrew Wang <andrew.w...@cloudera.com> wrote:

> We've successfully supported CDH4.2+ on Java 7 as well as CDH5, so I think
> that code wise we're ready to move to 7.



We already support Java 7 and OpenJDK 7. OpenJDK doesn't seem any
different in terms of reliability. There was one resource leak during stream
closes (Y! patch in Hadoop 2.3+; already fixed in OpenJDK now). In exchange
you get compressed pointers that work reliably, plus NUMA-aware heap
allocation and GC (short-lived heap on local CPU, long-lived heap striped
across sockets, GC threads work with their socket's RAM).

7u51 has broken a few things -not in Hadoop itself, but nearby:
https://issues.apache.org/jira/browse/JCLOUDS-427

The fix for that went into Guava 16.x
https://code.google.com/p/guava-libraries/issues/detail?id=1635

This shows a problem with Guava: fixes don't seem to get backported, and
updates drop old classes and methods. There's a risk that branch-2 will
need to update Guava whether we want to or not, if some change in the OS or
JVM forces it.


> However, we can't start using Java
> 7-only features in the 2.x line for compatibility reasons.
>
>
+1


> Also, even if the compatibility guidelines state that we can bump our
> dependencies whenever we want, practically speaking we can't in the 2.x
> line without breaking existing applications.


-1

We are already doing this for the non-traumatic artifacts (e.g. SLF4J,
commons-io, commons-lang, commons-logging &c.): stuff with good backwards
compatibility stories and no code changes. This lets YARN and client
applications pick up more recent JARs (which helps their later dependencies
work).

Guice and Guava? Leave alone unless there's no choice.



> I think the best we can do is
> fix this for 3.x by shading everything so we don't hit this issue again.
>
> Andrew
>

-1

Shading stops you seeing what's in the classpath, and stops ops teams doing
audits & control of what artifacts are on their hosts. What if someone
really did need to find what had Jetty 6 on their servers? Currently: find
& grep. With shading: not easy at all.
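
To make the audit point concrete, here is a minimal sketch in Python of
both cases. It is an illustration, not an existing tool: the function
names and the directory layout are made up for the example. The first
function is the "find & grep" on file names that works today; the second
has to open every JAR and look for the Jetty 6 package path
(org/mortbay/jetty/) inside it, which is roughly what an audit of shaded
artifacts would need, and it still misses classes that were relocated to a
different package name.

```python
import re
import zipfile
from pathlib import Path


def find_jars_by_name(root, pattern):
    """Return JAR files under root whose file name matches the regex.

    This is the cheap audit: it only works while artifacts keep their
    original names on disk (no shading).
    """
    regex = re.compile(pattern)
    return sorted(p for p in Path(root).rglob("*.jar") if regex.search(p.name))


def find_jars_containing(root, package_path):
    """Return JAR files under root with any entry containing package_path.

    This is what shading forces on you: every archive has to be opened and
    its entry list scanned. A substring match also catches entries nested
    under a prefix, but not packages that were renamed during relocation.
    """
    hits = []
    for jar in sorted(Path(root).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if any(package_path in name for name in zf.namelist()):
                    hits.append(jar)
        except zipfile.BadZipFile:
            # Skip files that are named *.jar but are not valid archives.
            continue
    return hits
```

Something like find_jars_by_name("/usr/lib/hadoop", r"^jetty-6\.") covers
the unshaded case (the path is only an example); the second function is
the slower fallback.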

What we can do is OSGi-enable the Hadoop core binaries:
https://issues.apache.org/jira/browse/HADOOP-7977?jql=project%20%3D%20HADOOP%20AND%20text%20~%20osgi

Then we could think about what it would take to deploy YARN apps in an OSGi
container, so that the AM & other containers come up with the YARN
classpath, but only have access to those bits of it Hadoop exports
(org.apache.hadoop.*, core & site XML), and everything else gets explicitly
loaded. AMs would need to tell YARN that they were OSGi apps and then
(somehow) they'd get launched in an OSGi container, picking up the local
Hadoop binaries, but having their own versions of everything.

Someone needs to volunteer to do all of that, of course. Patching the
Hadoop JARs to work in OSGi would be a first step, and could be done on
branch-2 without breaking existing code.
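
As a rough way to see how far a given set of JARs is from that first step,
the sketch below (Python, hypothetical helper names, not part of Hadoop)
reads the main section of each JAR's META-INF/MANIFEST.MF and checks for
the Bundle-SymbolicName header, which is what marks a JAR as an OSGi
bundle; Export-Package is where the exported packages would be listed. It
assumes a well-formed manifest and does not validate the header values.

```python
import zipfile


def read_manifest(jar_path):
    """Parse the main section of META-INF/MANIFEST.MF into a dict.

    Returns an empty dict if the JAR has no manifest.
    """
    with zipfile.ZipFile(jar_path) as zf:
        try:
            raw = zf.read("META-INF/MANIFEST.MF").decode("utf-8")
        except KeyError:
            return {}
    headers = {}
    key = None
    for line in raw.splitlines():
        if not line:
            break  # a blank line ends the main section
        if line.startswith(" ") and key is not None:
            # Continuation line: append to the previous header's value.
            headers[key] += line[1:]
        elif ": " in line:
            key, value = line.split(": ", 1)
            headers[key] = value
    return headers


def is_osgi_bundle(jar_path):
    """True if the JAR manifest declares a Bundle-SymbolicName."""
    return "Bundle-SymbolicName" in read_manifest(jar_path)
```

Running is_osgi_bundle over the JARs in a Hadoop distribution would give a
quick list of which ones still need OSGi metadata added.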

-Steve
