On 06/10/2015 11:01, Steve Loughran wrote:

I really don't want to do that as it relegates Solaris to only ever
being a second-class citizen.

I know that Solaris matters to you 100%, and we've tried to be as
supportive as we can, even though it's not viewed as important to
anyone else. We don't want to make it 2nd class, just want to get it
to be 1st class in a way which doesn't create lots of compatibility
problems.

Yes, you have been supportive, I recognise that and I'm grateful for it :-) Although I'm the main Solaris person that's visible, I'm not the only one who is interested. And I fully get the backwards compatibility thing; it's one of the main features of Solaris. However, keeping backwards binary compatibility is something you really have to decide on up front and design for; it's very difficult to add it as a constraint after the fact, as this scenario illustrates. And without internal or external library versioning support, it's even harder still.

Is the per-socket timeout assumption used anywhere outside the
tests?

I've no real idea yet, as I haven't got to the point where I have a 'Full Fat JNI' version of Hadoop on Solaris. I do know that around 50% of the ~200 test failures I'm seeing are most likely related to timeout handling, which is why I'm concentrating on it.

so we move from

function(fileHandle)

to function(Object), where object->fileHandle and object->timeout are both 
there?

To be precise, the signature change I have at the moment is (for example)

JNIEXPORT jint JNICALL
Java_org_apache_hadoop_net_unix_DomainSocket_accept0(
JNIEnv *env, jclass clazz, jint fd)

to

JNIEXPORT jint JNICALL
Java_org_apache_hadoop_net_unix_DomainSocket_accept0(
JNIEnv *env, jobject obj)

filehandle, readTimeout and writeTimeout are then accessed as members of the jobject.
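
As an illustration, a minimal sketch of what the native side of that might look like, pulling the members off the jobject with the standard JNI field-access calls. The field names ("fd", "readTimeout") and the poll()-before-accept() handling are assumptions based on the description above, not the actual patch:

#include <stddef.h>
#include <jni.h>
#include <poll.h>
#include <sys/socket.h>

/* Helper: fetch an int field from the DomainSocket object. */
static jint get_int_field(JNIEnv *env, jobject obj, const char *name)
{
    jclass clazz = (*env)->GetObjectClass(env, obj);
    jfieldID fid = (*env)->GetFieldID(env, clazz, name, "I");
    return (*env)->GetIntField(env, obj, fid);
}

JNIEXPORT jint JNICALL
Java_org_apache_hadoop_net_unix_DomainSocket_accept0(
    JNIEnv *env, jobject obj)
{
    jint fd = get_int_field(env, obj, "fd");
    jint readTimeout = get_int_field(env, obj, "readTimeout");
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    /* Where per-socket timeouts aren't available, wait for a pending
     * connection with poll() for up to readTimeout milliseconds. */
    if (poll(&pfd, 1, readTimeout) <= 0) {
        return -1;   /* real code would throw SocketTimeoutException */
    }
    return accept(fd, NULL, NULL);
}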

what about

function(fileHandle, timeout)

where we retain

function(fileHandle) { return function(fileHandle, defaultTimeout)}?

And then never invoke it in our existing code, which now calls the new 
operation?
or if there's a call

setTimeout(fileHandle, timeout)

which for linux sets the socket timeout —and in solaris updates some
map handle->timeout used in the select() call.

Yes, I'd thought of that. The problem is the 'some map' bit. Maintaining that map would be clunky: file descriptor IDs are not sequential and are reused, so we'd have to store them in some sort of shadow data structure and track each and every close, and that's fiddly.
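
To make the objection concrete, the shadow structure would need to look something like the sketch below (purely illustrative, not proposed code): a table of timeouts keyed by fd, with every close() path obliged to clear its slot so a recycled descriptor doesn't silently inherit a stale timeout.

#include <pthread.h>

#define MAX_FDS 65536   /* or grow dynamically; fds are reused, not sequential */

static int fd_timeouts[MAX_FDS];              /* timeout in ms; 0 = none set */
static pthread_mutex_t fd_lock = PTHREAD_MUTEX_INITIALIZER;

static void set_fd_timeout(int fd, int timeout_ms)
{
    if (fd < 0 || fd >= MAX_FDS)
        return;
    pthread_mutex_lock(&fd_lock);
    fd_timeouts[fd] = timeout_ms;
    pthread_mutex_unlock(&fd_lock);
}

static int get_fd_timeout(int fd)
{
    int t = 0;
    if (fd < 0 || fd >= MAX_FDS)
        return 0;
    pthread_mutex_lock(&fd_lock);
    t = fd_timeouts[fd];
    pthread_mutex_unlock(&fd_lock);
    return t;
}

/* Every close() path has to remember to call this, otherwise a recycled
 * fd inherits the previous owner's timeout. */
static void clear_fd_timeout(int fd)
{
    set_fd_timeout(fd, 0);
}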

And the 'default timeout' option is, I believe, a non-starter: the default timeout is 2 minutes, and many of the tests set it to a much shorter interval and expect it to time out at the specified time.

The problem is that if we store the timeout alongside the filehandle, we need access to an object pointer to retrieve it during the socket call. As the existing native functions are static, an object pointer isn't available.

I've looked long and hard at this and have not come up with a mechanism that is both backwards binary compatible and not totally vile.

The other option is to effectively write a complete Solaris-only
replacement for DomainSocket, whether switching between that and the
current one is done at compile or run-time isn't really the point.
There's a fairly even split between the Java & JNI components of
DomainSocket, so whichever way it's done there will be significant
duplication of the overall logic and most likely code duplication.
That means that bug fixes in one place have to be exactly mirrored in
another, and that's unlikely to be sustainable.

It's not going to be maintained, or more precisely: it'll be broken
on a regular basis and you are the one left to handle it.

Exactly, which is why it is a non-starter. Whatever I do to fix this needs to be as minimal as possible and needs to disappear on platforms which don't need it.

Unfortunately I can't predict when that might happen, though. In
my prototype it probes for working timeouts at configure time, so
when they do become available they'll be used automatically.
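
By way of illustration, a probe of this sort might look roughly like the following sketch (illustrative only, not the prototype's actual test): compile and run a tiny C program that sets SO_RCVTIMEO on a socketpair and checks whether a recv() on the empty socket actually returns rather than blocking forever.

#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Exit 0 if SO_RCVTIMEO actually times out a recv(), non-zero otherwise. */
int main(void)
{
    int sv[2];
    struct timeval tv = { 0, 200000 };   /* 200 ms */
    char buf[1];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return 2;
    if (setsockopt(sv[0], SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) != 0)
        return 1;                        /* option rejected outright */

    alarm(5);                            /* don't hang forever if it's ignored */
    recv(sv[0], buf, sizeof buf, 0);     /* nothing was sent, so this should time out */
    return 0;                            /* we got here, so the timeout fired */
}

Something like CMake's check_c_source_runs() could run that at configure time and set a preprocessor flag accordingly, so any poll()-based fallback would disappear automatically on platforms (or future Solaris releases) where per-socket timeouts work.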

I agree that there is no formal libhadoop.so compatibility policy and
that is frustrating.  This has been an issue for those who want to run
jars compiled against multiple different versions of hadoop through
the same YARN instance.  We've discussed it in the past, but never
really come up with a great solution.  The best approach really would
be to bundle libhadoop.so inside the hadoop jar files, so that it
could be integral to the Hadoop version itself.  However, nobody has
done the work to make that happen.  The second-best approach would be
to include the Hadoop version in the libhadoop name itself (so we'd
have libhadoop28.so for hadoop 2.8, and so forth.)  Anyway, I think we
can solve this particular issue without going down that rathole...

Unfortunately I don't think we can, not without further complicating the already complicated code with a lot of scaffolding.

I don't understand how YARN & multiple Hadoop versions interact, but if they are all in the same JVM instance then no amount of fiddling with shared objects will help, as you can't have multiple SOs providing the same APIs within the same process, or at least not without a lot of complicated, fragile and utterly platform-specific configuration and code.

--
Alan Burlison
--
