> On 5 Oct 2015, at 15:56, Alan Burlison <alan.burli...@oracle.com> wrote:
> 
> On 05/10/2015 15:14, Steve Loughran wrote:
> 
>> I don't think anyone would object to the changes, except for one big
>> caveat: a lot of us would like that binary file to be backwards
>> compatible; a Hadoop 2.6 JAR should be able to link to the 2.8+
>> libhadoop. So whatever gets changed, the old methods are still going
>> to hang around
> 
> That's not achievable as the method signatures need to change. Even though 
> they are private they need to change from static to normal methods and the 
> signatures need to change as well, as I said.

We've done it before, simply by retaining the older method entry points. Moving
from static to instance-specific is a bigger change. If the old entry points
are retained, even with every use of them ripped out of the Hadoop code, the
new methods are what get used. It's just that old stuff will still link.
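
A minimal sketch of why retained entry points keep old JARs linking, assuming
the standard JNI symbol-naming rules. The class and method names below are
hypothetical, not taken from libhadoop; the point is only that each native
method name maps to its own exported symbol, so a library can carry both.

```java
// Sketch: derive the "short" JNI symbol name that the JVM looks up in a
// native library for a given class and native method name.
final class JniSymbolNames {
    // Escape a name following the JNI name-mangling rules.
    static String mangle(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '.' || c == '/') {
                sb.append('_');            // package separator
            } else if (c == '_') {
                sb.append("_1");
            } else if (c == ';') {
                sb.append("_2");
            } else if (c == '[') {
                sb.append("_3");
            } else if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9')) {
                sb.append(c);
            } else {
                // Any other character becomes _0xxxx (four lowercase hex digits).
                sb.append(String.format("_0%04x", (int) c));
            }
        }
        return sb.toString();
    }

    static String shortName(String className, String methodName) {
        return "Java_" + mangle(className) + "_" + mangle(methodName);
    }

    public static void main(String[] args) {
        String cls = "org.example.NativeDemo";   // hypothetical class
        // Old entry point, kept only so that older JARs still link.
        System.out.println(shortName(cls, "verifyChunks"));
        // New entry point, the one current Java code would call.
        System.out.println(shortName(cls, "verifyChunksV2"));
    }
}
```

Note that the symbol name does not record whether the Java method is static,
so a method that changes from static to instance under the same name and
arguments still resolves to the same symbol. Keeping both variants side by
side would need distinct method names.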

> 
> JNI code is intimately intertwined with the Java code it runs with. Running 
> mismatching Java & JNI versions is going to be a recipe for eventual disaster 
> as the JVM explicitly does *not* do any error checking between Java and JNI.

You mean JNI code built for Java 7 isn't guaranteed to work on Java 8? If so,
that's not something we knew of, and it is something to worry about.


> At some point some innocuous change will be made that will just cause 
> undefined behaviour.
> 
> I don't actually know how you'd get a JAR/JNI mismatch as they are built and 
> packaged together, so I'm struggling to understand what the potential issue 
> is here.

It arises whenever you deploy to YARN an application that contains, directly
or indirectly (e.g. inside the spark-assembly JAR), the Hadoop Java classes of
a previous Hadoop version. libhadoop is on the PATH at the far end, your app
uploads its own Hadoop JARs, and the moment something tries to use a
JNI-backed method you get to see a stack trace.
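
A minimal sketch of that failure mode, assuming a standard JVM that resolves
native methods on first call. The class and method here are hypothetical, and
no native library is loaded at all, which stands in for a libhadoop that lacks
the entry point the JAR expects.

```java
// Sketch: a class with an unresolvable native method loads without complaint;
// the failure only appears when the native method is first invoked.
final class LazyNativeLink {
    // Declared native, but nothing ever provides an implementation.
    private static native int missingEntryPoint();

    static String tryCall() {
        try {
            missingEntryPoint();
            return "linked";
        } catch (UnsatisfiedLinkError e) {
            // This is the point at which a real application would
            // surface a stack trace.
            return "UnsatisfiedLinkError";
        }
    }

    public static void main(String[] args) {
        System.out.println("class loaded fine");
        System.out.println(tryCall());
    }
}
```

Nothing is checked at class-load or deployment time, which is why the mismatch
can sit unnoticed until a code path reaches the native call.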

https://issues.apache.org/jira/browse/HADOOP-11064

If you look at the patch there, that's the kind of thing I'd like to see to
address your Solaris issues.

> 
> In any case the constraint you are requesting would flat-out preclude this 
> change, and would also mean that most of the other JNI changes that have been 
> committed recently would have to be ripped out as well. In summary, the 
> bridge is already burned.
> 

We've covered the bridge in petrol but not quite dropped a match on it.

HADOOP-11127, "Improve versioning and compatibility support in native library
for downstream hadoop-common users", says "we need to do better here", which
probably means some way of packaging native libs.

Now, if you look at our compatibility statement, we don't say anything about 
native binary linking:
http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html

We have managed to avoid addressing this issue to date: the HADOOP-11064 
problem was caught before 2.6 shipped, and the patch put in without setting an 
immutable guarantee of compatibility going forward. We just don't want to light 
that bridge when a lot of users are on the other side of it.

-Steve
