Re: DomainSocket issues on Solaris

2015-10-08 Thread Chris Nauroth
Alan, thank you for picking up HADOOP-11127. I think it has needed a strong use case to kick it back into action, and maybe Solaris support is that use case. I'll join the discussion on the JIRA. --Chris Nauroth On 10/8/15, 9:40 AM, "Alan Burlison" wrote: >On 07/10/2015 22:05, Alan Burliso

Re: DomainSocket issues on Solaris

2015-10-08 Thread Alan Burlison
On 07/10/2015 22:05, Alan Burlison wrote: I'll draft up a proposal and attach it to HADOOP-11127. Attached to HADOOP-11127 as proposal.txt -- Alan Burlison --

Re: DomainSocket issues on Solaris

2015-10-07 Thread Alan Burlison
On 07/10/15 18:53, Colin P. McCabe wrote: I think you could come up with a select/poll solution while using the old function signatures. A 4-byte int is more than enough information to pass in, given that you can use it as an index into a table in the C code. I have thought about that but a s
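
The table-index idea quoted above can be sketched roughly as follows: the Java side keeps passing a single int, and the C side treats that int as a handle into a table holding whatever per-socket state is needed (here, the fd and two timeouts). All names (`sock_state`, `sock_table_add`, `sock_table_get`, `sock_table_remove`, `MAX_SOCKETS`) are hypothetical and not taken from DomainSocket.c, and the sketch ignores locking, which real JNI code would need.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SOCKETS 64  /* hypothetical fixed capacity, for illustration only */

/* Per-socket state kept on the C side; the caller only ever holds an int. */
struct sock_state {
    int in_use;
    int fd;
    int read_timeout_ms;
    int write_timeout_ms;
};

static struct sock_state sock_table[MAX_SOCKETS];

/* Register fd and return a handle (table index), or -1 if the table is
 * full or fd is invalid. Not thread-safe: a real version needs a lock. */
int sock_table_add(int fd)
{
    int i;

    if (fd < 0)
        return -1;
    for (i = 0; i < MAX_SOCKETS; i++) {
        if (!sock_table[i].in_use) {
            sock_table[i].in_use = 1;
            sock_table[i].fd = fd;
            sock_table[i].read_timeout_ms = 0;
            sock_table[i].write_timeout_ms = 0;
            return i;
        }
    }
    return -1;
}

/* Look up the state for a handle; NULL if out of range or not in use. */
struct sock_state *sock_table_get(int handle)
{
    if (handle < 0 || handle >= MAX_SOCKETS || !sock_table[handle].in_use)
        return NULL;
    return &sock_table[handle];
}

/* Release a handle so its slot can be reused. */
void sock_table_remove(int handle)
{
    struct sock_state *s = sock_table_get(handle);

    assert(s != NULL && "handle was never issued or already removed");
    s->in_use = 0;
}
```

The trade-off is that the handle is no longer a raw file descriptor, so every native entry point has to go through the lookup before touching the socket.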

Re: DomainSocket issues on Solaris

2015-10-07 Thread Colin P. McCabe
On Wed, Oct 7, 2015 at 9:35 AM, Alan Burlison wrote: > On 06/10/2015 10:52, Steve Loughran wrote: > >> HADOOP-11127, "Improve versioning and compatibility support in native >> library for downstream hadoop-common users." says "we need to do >> better here", which is probably some way of packaging

Re: DomainSocket issues on Solaris

2015-10-07 Thread Alan Burlison
On 06/10/2015 10:52, Steve Loughran wrote: HADOOP-11127, "Improve versioning and compatibility support in native library for downstream hadoop-common users." says "we need to do better here", which is probably some way of packaging native libs. From that JIRA: Colin Patrick McCabe added a co

Re: DomainSocket issues on Solaris

2015-10-06 Thread Steve Loughran
On 10/6/15, 8:25 AM, "Alan Burlison" wrote: > In any case the constraint you are requesting would flat-out > preclude this change, and would also mean that most of the other > JNI changes that have been committed recently would have to be > ripped out as well. In s

Re: DomainSocket issues on Solaris

2015-10-06 Thread Alan Burlison
On 06/10/2015 17:03, Chris Nauroth wrote: Alan, would you please list the specific patches/JIRA issues that broke compatibility? I have not been reviewing the native code lately, so it would help me catch up quickly if you already know which specific patches have introduced problems. If those

Re: DomainSocket issues on Solaris

2015-10-06 Thread Chris Nauroth
I just spotted one: HADOOP-10027. A field was removed from the Java layer, which still could get referenced by an older version of the native layer. A backwards-compatible version of that patch would preserve the old fields in the Java layer. Full disclosure: I was the one who committed that pat

Re: DomainSocket issues on Solaris

2015-10-06 Thread Chris Nauroth
Alan, would you please list the specific patches/JIRA issues that broke compatibility? I have not been reviewing the native code lately, so it would help me catch up quickly if you already know which specific patches have introduced problems. If those patches currently reside only on trunk and br

Re: DomainSocket issues on Solaris

2015-10-06 Thread Alan Burlison
On 06/10/2015 10:52, Steve Loughran wrote: That's not achievable as the method signatures need to change. Even though they are private they need to change from static to normal methods and the signatures need to change as well, as I said. We've done it before, simply by retaining the older met

Re: DomainSocket issues on Solaris

2015-10-06 Thread Alan Burlison
On 06/10/2015 11:01, Steve Loughran wrote: I really don't want to do that as it relegates Solaris to only ever being a second-class citizen. I know that Solaris matters to you 100%, and we've tried to be as supportive as we can, even though it's not viewed as important to anyone else. We don't

Re: DomainSocket issues on Solaris

2015-10-06 Thread Steve Loughran
On 6 Oct 2015, at 00:34, Alan Burlison wrote: On 05/10/15 18:30, Colin P. McCabe wrote: 1. Don't get DomainSocket working on Solaris. Rely on the legacy short-circuit read instead. It has poorer security guarantees, but doesn't require domain sockets. You ca

Re: DomainSocket issues on Solaris

2015-10-06 Thread Steve Loughran
> On 5 Oct 2015, at 15:56, Alan Burlison wrote: > > On 05/10/2015 15:14, Steve Loughran wrote: > >> I don't think anyone would object for the changes, except for one big >> caveat: a lot of us would like that binary file to be backwards >> compatible; a Hadoop 2.6 JAR should be able to link to

Re: DomainSocket issues on Solaris

2015-10-05 Thread Alan Burlison
On 05/10/15 18:30, Colin P. McCabe wrote: 1. Don't get DomainSocket working on Solaris. Rely on the legacy short-circuit read instead. It has poorer security guarantees, but doesn't require domain sockets. You can add a line of code to the failing junit tests to skip them on Solaris. I real

Re: DomainSocket issues on Solaris

2015-10-05 Thread Colin P. McCabe
Hi Alan, As Chris commented earlier, the main use of DomainSocket is to transfer file descriptors from the DataNode to the DFSClient. As you know, this is something that can only be done through domain sockets, not through inet sockets. We do support passing data over domain sockets, but in prac

Re: DomainSocket issues on Solaris

2015-10-05 Thread Alan Burlison
On 05/10/2015 15:14, Steve Loughran wrote: I don't think anyone would object for the changes, except for one big caveat: a lot of us would like that binary file to be backwards compatible; a Hadoop 2.6 JAR should be able to link to the 2.8+ libhadoop. So whatever gets changed, the old methods ar

Re: DomainSocket issues on Solaris

2015-10-05 Thread Steve Loughran
I don't think anyone would object for the changes, except for one big caveat: a lot of us would like that binary file to be backwards compatible; a Hadoop 2.6 JAR should be able to link to the 2.8+ libhadoop. So whatever gets changed, the old methods are still going to hang around > On 2 Oct 20

Re: DomainSocket issues on Solaris

2015-10-02 Thread Alan Burlison
On 30/09/2015 09:14, Alan Burlison wrote: The basic idea is to add two new fields to DomainSocket.c to hold the read/write timeouts. On platforms that support SO_SNDTIMEO and SO_RCVTIMEO these would be unused as setsockopt() would be used to set the socket timeouts. On platforms such as Solaris
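
As a rough illustration of the alternative to SO_RCVTIMEO described above, a read timeout can be emulated by waiting on the descriptor with poll() before calling read(). This is only a sketch under stated assumptions: the helper name `read_with_timeout` and the choice to report a timeout as errno = ETIMEDOUT are hypothetical, not what the proposal attached to HADOOP-11127 specifies.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for fd to become readable,
 * then read up to len bytes. Returns the byte count, 0 on EOF, or -1 with
 * errno set (ETIMEDOUT if nothing arrived in time). A negative timeout_ms
 * waits indefinitely, as with poll() itself. */
ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    assert(buf != NULL);

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Retry if a signal interrupts the wait. Note that this simple loop
     * restarts the full timeout after each interruption. */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc == 0) {
        errno = ETIMEDOUT;
        return -1;
    }
    if (rc < 0)
        return -1;

    /* Readable, hung up, or in error: let read() report the outcome. */
    return read(fd, buf, len);
}
```

A write-side counterpart would poll for POLLOUT before write() in the same way; the per-socket timeout value is what the two proposed fields would have to carry.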

Re: DomainSocket issues on Solaris

2015-09-30 Thread Alan Burlison
On 30/09/2015 17:23, Chris Nauroth wrote: I think file descriptor sharing is a capability of Unix domain sockets only, and not INET sockets. Yes, that's correct. -- Alan Burlison --

Re: DomainSocket issues on Solaris

2015-09-30 Thread Chris Nauroth
That's an interesting find, though I don't think we'd be able to swap in INET sockets in this part of the code. We use Unix domain sockets to share an open file descriptor from the DataNode process to the HDFS client process, and then the client reads directly from that open file descriptor. I th
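
For readers unfamiliar with the mechanism, descriptor sharing over a Unix domain socket is done with an SCM_RIGHTS control message on sendmsg()/recvmsg(). The sketch below is a generic illustration, not Hadoop's code: `send_fd` and `recv_fd` are hypothetical names, and it assumes a platform that exposes the `msg_control` form of `struct msghdr`.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: pass fd to the peer of a connected AF_UNIX socket.
 * One dummy data byte is sent so the control message has a payload to
 * ride on. Returns 0 on success, -1 on failure. */
int send_fd(int sock, int fd)
{
    char byte = 0;
    struct iovec iov;
    union {
        struct cmsghdr hdr;               /* forces correct alignment */
        char buf[CMSG_SPACE(sizeof(int))];
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    assert(fd >= 0);

    memset(&msg, 0, sizeof(msg));
    memset(&ctrl, 0, sizeof(ctrl));
    iov.iov_base = &byte;
    iov.iov_len = 1;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor sent by send_fd().
 * Returns the new descriptor (owned by the caller), or -1 on failure. */
int recv_fd(int sock)
{
    char byte;
    int fd;
    struct iovec iov;
    union {
        struct cmsghdr hdr;
        char buf[CMSG_SPACE(sizeof(int))];
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&msg, 0, sizeof(msg));
    memset(&ctrl, 0, sizeof(ctrl));
    iov.iov_base = &byte;
    iov.iov_len = 1;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The received descriptor refers to the same open file as the sender's, which is why the client can read block data directly once the DataNode has handed it over; there is no equivalent control message for INET sockets.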

Re: DomainSocket issues on Solaris

2015-09-30 Thread Alan Burlison
On 30/09/2015 16:56, Chris Nauroth wrote: Alan, I also meant to say that I didn't understand the comment about "in production it seems that DomainSocket is less commonly used". The current implementation of short-circuit read definitely utilizes DomainSocket, and it's very common to enable this

Re: DomainSocket issues on Solaris

2015-09-30 Thread Chris Nauroth
Alan, I also meant to say that I didn't understand the comment about "in production it seems that DomainSocket is less commonly used". The current implementation of short-circuit read definitely utilizes DomainSocket, and it's very common to enable this in production clusters. The documentation p

Re: DomainSocket issues on Solaris

2015-09-30 Thread Chris Nauroth
Hello Alan, I think this sounds like a reasonable approach. I recommend that you file a JIRA with the proposal (copy-paste the content of your email into a comment) and then wait a few days before starting work in earnest to see if anyone else wants to discuss it first. I also recommend notifyin