On 29 March 2014 02:41, Andrew Wang <andrew.w...@cloudera.com> wrote:
> We've successfully supported CDH4.2+ on Java 7 as well as CDH5, so I think
> that code wise we're ready to move to 7.

We're already supporting Java 7 and OpenJDK 7. OpenJDK doesn't seem any different in terms of reliability. There was one resource leak during stream closes (Y! patch in Hadoop 2.3+; already fixed in OpenJDK now).

In exchange you get compressed pointers that work reliably, and NUMA-aware heap allocation and GC (short-lived heap on the local CPU, long-lived heap striped across sockets, GC threads working with their own socket's RAM).

7u51 has broken a few things, not in Hadoop itself but nearby:
https://issues.apache.org/jira/browse/JCLOUDS-427

The fix for that went into Guava 16.x:
https://code.google.com/p/guava-libraries/issues/detail?id=1635

This shows a problem with Guava: fixes don't seem to get backported, and updates drop old classes and methods. There's a risk that branch-2 will need to update Guava whether we want to or not, if some change in the OS or JVM forces it.

> However, we can't start using Java
> 7-only features in the 2.x line for compatibility reasons.

+1

> Also, even if the compatibility guidelines state that we can bump our
> dependencies whenever we want, practically speaking we can't in the 2.x
> line without breaking existing applications.

-1

We are already doing this for the non-traumatic artifacts (e.g. SLF4J, commons-io, commons-lang, commons-logging &c.): the ones with good backwards-compatibility stories that need no code changes. This lets YARN and client applications pick up more recent JARs, which helps their own downstream dependencies work.

Guice and Guava? Leave alone unless there's no choice.

> I think the best we can do is
> fix this for 3.x by shading everything so we don't hit this issue again.
>
> Andrew

-1

Shading stops you seeing what's in the classpath, and stops ops teams auditing and controlling which artifacts are on their hosts. What if someone really did need to find what had Jetty 6 on their servers? Currently: find & grep.
With shading, not easy at all.

What we can do is OSGi-enable the Hadoop core binaries:
https://issues.apache.org/jira/browse/HADOOP-7977?jql=project%20%3D%20HADOOP%20AND%20text%20~%20osgi

Then we could think about what it would take to deploy YARN apps in an OSGi container, so that the AM and other containers come up with the YARN classpath but only have access to those bits of it that Hadoop exports (org.apache.hadoop.*, core & site XML), and everything else gets explicitly loaded.

AMs would need to tell YARN that they were OSGi apps, and then (somehow) they'd get launched in an OSGi container, picking up the local Hadoop binaries but having their own versions of everything.

Someone needs to volunteer to do all of that, of course. Patching the Hadoop JARs to work in OSGi would be a first step, and could be done on branch-2 without breaking existing code.

-Steve
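
To illustrate the audit point about shading: once classes are bundled inside another JAR, a filename search no longer finds them, so an audit has to look at the entries inside every JAR. The following is only a rough sketch of that idea, not anything that exists in Hadoop. The class name JarAudit and the method jarsContaining are hypothetical, and it assumes that a relocated class keeps its original package path somewhere in its entry name, which is common for shading setups but not guaranteed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: lists JARs whose contents mention a package path.
final class JarAudit {

    /** Returns every .jar under root that has an entry whose name contains fragment. */
    static List<Path> jarsContaining(Path root, String fragment) throws IOException {
        List<Path> jars;
        try (Stream<Path> walk = Files.walk(root)) {
            jars = walk
                .filter(p -> Files.isRegularFile(p) && p.toString().endsWith(".jar"))
                .collect(Collectors.toList());
        }
        List<Path> hits = new ArrayList<>();
        for (Path jar : jars) {
            try (JarFile jf = new JarFile(jar.toFile())) {
                Enumeration<JarEntry> entries = jf.entries();
                while (entries.hasMoreElements()) {
                    // Substring match, so a prefixed (relocated) path still matches.
                    if (entries.nextElement().getName().contains(fragment)) {
                        hits.add(jar);
                        break;
                    }
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: JarAudit <directory> <path-fragment>");
            return;
        }
        for (Path jar : jarsContaining(Paths.get(args[0]), args[1])) {
            System.out.println(jar);
        }
    }
}
```

Run against an install directory with a fragment such as org/mortbay/jetty/, this would list the JARs that carry Jetty 6 classes, shaded or not, under the assumption stated above. It is noticeably more work than find & grep on file names, and a relocation that rewrites the package path entirely would still evade it.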