Hello Alan,

I think this sounds like a reasonable approach.  I recommend that you file
a JIRA with the proposal (copy-paste the content of your email into a
comment) and then wait a few days before starting work in earnest to see
if anyone else wants to discuss it first.  I also recommend notifying
Colin Patrick McCabe on that JIRA.  It would be good to get a second
opinion from him, since he is the original author of much of this code.

--Chris Nauroth




On 9/30/15, 1:14 AM, "Alan Burlison" <alan.burli...@oracle.com> wrote:

>Now that the Hadoop native code builds on Solaris I've been chipping
>away at all the test failures. About 50% of the failures involve
>DomainSocket, either directly or indirectly. That seems to be mainly
>because the tests use DomainSocket to do single-node testing, whereas in
>production it seems that DomainSocket is less commonly used
>(https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/ShortCircuitLocalReads.html).
>
>The particular problem on Solaris is that socket read/write timeouts
>(the SO_SNDTIMEO and SO_RCVTIMEO socket options) are not supported for
>UNIX domain (PF_UNIX) sockets. Those options are however supported for
>PF_INET sockets. That's because the socket implementation on Solaris is
>split roughly into two parts, for inet sockets and for STREAMS sockets,
>and the STREAMS implementation lacks support for SO_SNDTIMEO and
>SO_RCVTIMEO. As an aside, performance of sockets that use loopback or
>the host's own IP is slightly better than that of UNIX domain sockets on
>Solaris.
>
>I'm investigating getting timeouts supported for PF_UNIX sockets added
>to Solaris, but in the meantime I'm also looking at how this might be
>worked around in Hadoop. One way would be to implement timeouts by
>wrapping all the read/write/send/recv etc calls in DomainSocket.c with
>either poll() or select().
>
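
As a rough illustration of the poll() idea above, here is a minimal sketch of
a timed read. The name hd_read_timeout and the millisecond timeout parameter
are assumptions made for this example; nothing like it exists in
DomainSocket.c today. On timeout it sets errno to EAGAIN, which is what a
read reports on Linux when SO_RCVTIMEO expires.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of DomainSocket.c): read up to len bytes
 * from fd, waiting at most timeout_ms milliseconds for data to arrive.
 * A negative timeout_ms makes poll() wait indefinitely, so a caller that
 * wants "no timeout" should pass -1 rather than 0.
 * Returns the result of read(), or -1 with errno set. On timeout errno
 * is EAGAIN.
 */
static ssize_t hd_read_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /*
     * Restart poll() if a signal interrupts it. This sketch restarts with
     * the full timeout; a stricter version would subtract the time that
     * has already elapsed.
     */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0) {
        return -1;              /* poll() failed, errno already set */
    }
    if (rc == 0) {
        errno = EAGAIN;         /* nothing became readable in time */
        return -1;
    }
    /* Data, EOF, or a socket error is pending; read() reports which. */
    return read(fd, buf, len);
}
```

The write side would look the same with POLLOUT and write() or send().
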
>The basic idea is to add two new fields to DomainSocket.c to hold the
>read/write timeouts. On platforms that support SO_SNDTIMEO and
>SO_RCVTIMEO these would be unused as setsockopt() would be used to set
>the socket timeouts. On platforms such as Solaris the JNI code would use
>the values to implement the timeouts appropriately.
>
>To prevent the code in DomainSocket.c becoming a #ifdef hairball, the
>current socket IO function calls such as accept(), send(), read() etc
>would be replaced with macros such as HD_ACCEPT. On platforms that
>provide timeouts these would just expand to the normal socket functions;
>on platforms that don't support timeouts they would expand to wrappers
>that implement the timeouts.
>
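
The macro scheme described above might look something like the sketch below.
HD_ACCEPT is the name from the proposal; the build flag
HD_NO_SOCKET_TIMEOUTS, the wrapper hd_accept_timeout, and the extra
timeout_ms argument are all assumptions made for illustration.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * HD_NO_SOCKET_TIMEOUTS is a hypothetical build-time flag, defined only on
 * platforms whose PF_UNIX sockets do not support SO_SNDTIMEO/SO_RCVTIMEO.
 */
#ifdef HD_NO_SOCKET_TIMEOUTS

/*
 * Hypothetical wrapper: wait up to timeout_ms milliseconds for a pending
 * connection, then accept it. Returns -1 with errno set to EAGAIN if
 * nothing arrives in time.
 */
static int hd_accept_timeout(int fd, struct sockaddr *addr, socklen_t *len,
                             int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0) {
        return -1;              /* poll() failed, errno already set */
    }
    if (rc == 0) {
        errno = EAGAIN;         /* no connection arrived in time */
        return -1;
    }
    return accept(fd, addr, len);
}

#define HD_ACCEPT(fd, addr, len, timeout_ms) \
    hd_accept_timeout((fd), (addr), (len), (timeout_ms))

#else

/*
 * Plain expansion: timeout_ms is ignored here, and any timeout configured
 * with setsockopt() is left to the kernel.
 */
#define HD_ACCEPT(fd, addr, len, timeout_ms) \
    accept((fd), (addr), (len))

#endif
```

Call sites would then read the same on every platform, for example
HD_ACCEPT(fd, NULL, NULL, timeout_ms), with the #ifdef confined to one place.
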
>The only caveat is that all code that does anything to a PF_UNIX
>socket would *always* have to do so via DomainSocket. As far as I can
>tell that's not an issue, but it would have to be borne in mind if any
>changes were made in this area.
>
>Before I set about doing this, does the approach seem reasonable?
>
>Thanks,
>
>-- 
>Alan Burlison
>--
>
