Porting to Solaris

2014-02-18 Thread Malcolm
I have started porting the native libraries of Hadoop 2.2.0 to Solaris and
would like to eventually get my changes into the main Hadoop source tree.

Are there any public coding guidelines/requirements that I should follow?

Is there a specific procedure to follow once I have completed the work, e.g.
whom I should send the changes to?

Thanks,
Malcolm


Best Linux version for Hadoop development

2014-06-13 Thread Malcolm
Running the tests on Hadoop 2.4, both native and Java, results in failures on
Ubuntu 14.04.

I understand that for patches to be accepted, the minimum criterion is that
all tests pass; however, the vanilla release itself fails quite quickly.

Is this a well-known issue?

Should I be using a different Linux version?

BTW, even on 2.2, not all tests passed successfully (on Ubuntu 12.04).

Thanks,
Malcolm


Extract from the test log (running with the -fae flag to keep going):

Failed tests:
  TestSymlinkLocalFSFileContext>SymlinkBaseTest.testStatLinkToFile:244 null
  TestSymlinkLocalFSFileContext>SymlinkBaseTest.testStatLinkToDir:286 null

TestSymlinkLocalFSFileContext>SymlinkBaseTest.testCreateLinkUsingRelPaths:447->SymlinkBaseTest.checkLink:381
null

TestSymlinkLocalFSFileContext>SymlinkBaseTest.testCreateLinkUsingAbsPaths:472->SymlinkBaseTest.checkLink:381
null

TestSymlinkLocalFSFileContext>SymlinkBaseTest.testCreateLinkUsingFullyQualPaths:503->SymlinkBaseTest.checkLink:381
null

TestSymlinkLocalFSFileContext>TestSymlinkLocalFS.testStatDanglingLink:115->SymlinkBaseTest.testStatDanglingLink:301
null

TestSymlinkLocalFSFileContext>SymlinkBaseTest.testCreateLinkToDirectory:627
null
  TestSymlinkLocalFSFileContext>SymlinkBaseTest.testCreateLinkViaLink:679
null

TestSymlinkLocalFSFileContext>SymlinkBaseTest.testRenameSymlinkViaSymlink:897
null

TestSymlinkLocalFSFileContext>SymlinkBaseTest.testRenameSymlinkNonExistantDest:1036
null

TestSymlinkLocalFSFileContext>SymlinkBaseTest.testRenameSymlinkToExistingFile:1063
null
  TestSymlinkLocalFSFileContext>SymlinkBaseTest.testRenameSymlink:1134 null
  TestSymlinkLocalFSFileSystem>SymlinkBaseTest.testStatLinkToFile:244 null
  TestSymlinkLocalFSFileSystem>SymlinkBaseTest.testStatLinkToDir:286 null

TestSymlinkLocalFSFileSystem>SymlinkBaseTest.testCreateLinkUsingRelPaths:447->SymlinkBaseTest.checkLink:381
null

TestSymlinkLocalFSFileSystem>SymlinkBaseTest.testCreateLinkUsingAbsPaths:472->SymlinkBaseTest.checkLink:381
null

TestSymlinkLocalFSFileSystem>SymlinkBaseTest.testCreateLinkUsingFullyQualPaths:503->SymlinkBaseTest.checkLink:381
null

TestSymlinkLocalFSFileSystem>SymlinkBaseTest.testCreateLinkToDirectory:627
null
  TestSymlinkLocalFSFileSystem>SymlinkBaseTest.testCreateLinkViaLink:679
null

TestSymlinkLocalFSFileSystem>SymlinkBaseTest.testRenameSymlinkViaSymlink:897
null

TestSymlinkLocalFSFileSystem>SymlinkBaseTest.testRenameSymlinkNonExistantDest:1036
null

TestSymlinkLocalFSFileSystem>SymlinkBaseTest.testRenameSymlinkToExistingFile:1063
null
  TestSymlinkLocalFSFileSystem>SymlinkBaseTest.testRenameSymlink:1134 null

TestSymlinkLocalFSFileSystem>TestSymlinkLocalFS.testStatDanglingLink:115->SymlinkBaseTest.testStatDanglingLink:301
null

Tests in error:
  TestSymlinkLocalFSFileContext>TestSymlinkLocalFS.testDanglingLink:163 »
IO Pat...

TestSymlinkLocalFSFileContext>TestSymlinkLocalFS.testGetLinkStatusPartQualTarget:201
» IO

TestSymlinkLocalFSFileContext>SymlinkBaseTest.testCreateLinkToDotDotPrefix:822
» IO
  TestSymlinkLocalFSFileSystem>TestSymlinkLocalFS.testDanglingLink:163 » IO
Path...

TestSymlinkLocalFSFileSystem>TestSymlinkLocalFS.testGetLinkStatusPartQualTarget:201
» IO

TestSymlinkLocalFSFileSystem>SymlinkBaseTest.testCreateLinkToDotDotPrefix:822
» IO

Tests run: 2285, Failures: 24, Errors: 6, Skipped: 41
[INFO] Apache Hadoop Main  SUCCESS [0.266s]
[INFO] Apache Hadoop Project POM . SUCCESS [0.402s]
[INFO] Apache Hadoop Annotations . SUCCESS [0.462s]
[INFO] Apache Hadoop Project Dist POM  SUCCESS [0.123s]
[INFO] Apache Hadoop Assemblies .. SUCCESS [0.124s]
[INFO] Apache Hadoop Maven Plugins ... SUCCESS [1.263s]
[INFO] Apache Hadoop MiniKDC . SUCCESS [50.558s]
[INFO] Apache Hadoop Auth  SUCCESS [4:39.256s]
[INFO] Apache Hadoop Auth Examples ... SUCCESS [0.106s]
[INFO] Apache Hadoop Common .. FAILURE [13:57.764s]
[INFO] Apache Hadoop NFS . SKIPPED
[INFO] Apache Hadoop Common Project .. SUCCESS [0.026s]
[INFO] Apache Hadoop HDFS  SKIPPED
[INFO] Apache Hadoop HttpFS .. SKIPPED
[INFO] Apache Hadoop HDFS BookKeeper Journal . SKIPPED
[INFO] Apache Hadoop HDFS-NFS  SKIPPED
[INFO] Apache Hadoop HDFS Project  SUCCESS [0.027s]
[INFO] hadoop-yarn ... SUCCESS [0.027s]
[INFO] hadoop-yarn-api ... SKIPPED
[INFO] hadoop-yarn-common  SKIPPED
[INFO] hadoop-yarn-server  SUCCESS [0.026s]
[INFO] hadoop-yarn-server-common 

Solaris Port

2014-12-08 Thread malcolm

I have ported the Hadoop native libraries to Solaris 11 (both SPARC and Intel).
Oracle has agreed to release my changes to the community so that 
Solaris platforms can benefit.
Reading the HowToContribute and GitandHadoop documents, I am not 100% 
clear on how to get my changes into the main tree. I am also a Git(hub) 
newbie, and was using svn previously.


Please let me know if I am on the right path:

1. I forked Hadoop on Github and downloaded a clone to my development 
machine.


2. The changes I made were to 2.2.0. Can I still add changes to this 
branch and hopefully get them accepted, or must I migrate my changes to 
2.6? (On the main Hadoop download page, 2.2 is still listed as the GA 
version.)


3. I understand that I should create a new branch for my changes, and 
then generate pull requests after uploading them to Github.


4. I also registered at Jira, on the understanding that I need to 
generate a Jira number for my changes and to name my branch accordingly?


Does all this make sense?

Thanks,
Malcolm




Re: Solaris Port

2014-12-08 Thread malcolm

Hi Colin,

A short summary of my changes is as follows:

- Native C source files: added 5, modified 6, which also required changes to 
CMakeLists.txt. Of course, all changes are "ifdeffed" for Solaris 
appropriately, and new files are prefixed with solaris_ as well.


For example, Solaris no longer has errno or errlist, which are used 
quite a lot in Hadoop native code. I could have replaced all calls 
to use strerror instead, which would be compatible with Linux; however, in 
the interests of making minimal changes, I recreated and added these 
files from a running Solaris machine instead.
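
For illustration only, a recreated table along these lines might look roughly as follows; this is a minimal sketch of the approach described above, with a hypothetical file name and only a few entries shown, not the actual patch contents:

    /* solaris_errlist.c -- hypothetical sketch of an errlist shim for Solaris.
     * The message strings would be captured from a running Solaris machine;
     * all but the first few entries are elided here. */
    #if defined(__sun)
    const char *sys_errlist[] = {
        "Error 0",                    /* 0          */
        "Not owner",                  /* 1  EPERM   */
        "No such file or directory",  /* 2  ENOENT  */
        "No such process",            /* 3  ESRCH   */
        /* ... */
    };
    const int sys_nerr = (int)(sizeof(sys_errlist) / sizeof(sys_errlist[0]));
    #endif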


Another issue is that Solaris doesn't have the timeout option for 
sockets, so I had to write my own solaris_read routine with a timeout and 
added it to DomainSocket.c. A few issues with lz4 on SPARC needed 
modification, plus some other OS-specific issues: getgrouplist and 
container-executor (from YARN).
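
For illustration, a read-with-timeout of this sort can be built on poll(); the following is a minimal sketch of the general technique (hypothetical function name, not the actual solaris_read from the patch):

    #include <poll.h>
    #include <unistd.h>
    #include <errno.h>

    /* Wait up to timeout_ms for data to arrive, then read; returns -1
     * with errno = EAGAIN on timeout. */
    ssize_t read_with_timeout(int fd, void *buf, size_t count, int timeout_ms)
    {
        struct pollfd pfd;
        int rc;

        pfd.fd = fd;
        pfd.events = POLLIN;
        rc = poll(&pfd, 1, timeout_ms);
        if (rc == 0) {            /* timed out */
            errno = EAGAIN;
            return -1;
        }
        if (rc < 0) {             /* poll() itself failed */
            return -1;
        }
        return read(fd, buf, count);
    }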


- Some very minor changes were made to some Java source files (mainly 
tests, to get them to pass on Solaris).


The above changes were made to 2.2. I will recheck everything against 
the latest trunk; maybe some fixes aren't needed any more.


I have generated a single patch file with all changes. Perhaps it would 
be better to file multiple JIRAs, grouped one per issue? Or should I 
file a JIRA for each modified source file?


Thank you,
Malcolm

On 12/08/2014 09:53 PM, Colin McCabe wrote:

Hi Malcolm,

It's great that you are going to contribute!  Please make your patches
against trunk.

2.2 is fairly old at this point.  It hasn't been the focus of
development in more than a year.

We don't use github or pull requests.

Check the section on http://wiki.apache.org/hadoop/HowToContribute
that talks about "Contributing your work".  Excerpt:
"Finally, patches should be attached to an issue report in Jira via
the Attach File link on the issue's Jira. Please add a comment that
asks for a code review following our code review checklist. Please
note that the attachment should be granted license to ASF for
inclusion in ASF works (as per the Apache License §5)."

As this says, you attach the patch file to a JIRA that you have
created, and then hit "submit patch."

I don't think a branch is required for this work since it is just
build fixes, right?

best,
Colin








Re: Solaris Port

2014-12-10 Thread malcolm

Hi Colin,

Thanks for the hints around JIRAs.

You are correct that errno still exists; however, sys_errlist does not.

Hadoop uses a function terror (defined in exception.c) which indexes 
sys_errlist by errno to return the error message from the array. This 
function is called 26 times in various places (in 2.2).


Originally, I thought to replace all calls to terror with strerror, but 
there can be issues with multi-threading (it returns a buffer which can 
be overwritten), so it seemed simpler just to recreate the sys_errlist 
message array.


There is also a multi-threaded version, strerror_r, where you pass the 
buffer as a parameter, but this would necessitate changing every call to 
terror with multiple lines of code.


Sorry, I wasn't clear.

Also, I have been requested to ensure my port is available on 2.4, which 
is perceived as a more stable release. If I make changes to this branch, 
are they automatically available for 2.6, or will I need multiple JIRAs?


Thanks,
Malcolm

On 12/10/2014 10:45 AM, Colin McCabe wrote:

Hi Malcolm,

In general we file JIRAs for particular issues.  So if one issue is
handling errlist on Solaris, that might be one JIRA.  Another issue
might be handling socket write timeouts on Solaris.  And so on.  Most
of these should probably be HADOOP tickets since they sound like they
are mostly in the generic hadoop-common code.

"solaris does not have errno" seems like a bold statement.  errno is
part of POSIX, and Solaris is a POSIX OS, right?  Am I way off base on
this?
I googled around and one of the first results I found talked about
errno values on Solaris.
http://www.pixelstech.net/article/1413273556-A-trick-of-building-multithreaded-application-on-Solaris
  Perhaps I misunderstood what you meant by this statement.

Anyway, please file JIRAs for any portability improvements you can think of!

best,
Colin


Re: Solaris Port

2014-12-10 Thread malcolm

Hi Colin,

Exactly, as you noticed, the problem is the thread-local buffer needed 
to return from terror.
Currently, terror just returns a static string from an array; this is 
fast, simple, and error-proof.


Using strerror_r inside terror would require allocating a buffer inside 
terror and depending on the caller to free the buffer after using it, 
or passing a buffer to terror (which is basically the same as 
strerror_r, rendering terror redundant).
Both cases require modification outside terror itself; as far as I can 
tell, there is no simple fix. Unless you have an alternative which I 
haven't thought of?


As far as I can tell, we have two choices:

1. Remove terror and replace calls with strerror_r, passing a buffer 
from the caller.

    Advantage: a more modern, portable interface.
    Disadvantage: all calls to terror need to be modified, though all 
seem to be in a few files as far as I can tell.


2. Add a sys_errlist array (ifdeffed for Solaris).

    Advantage: no change to any calls to terror.
    Disadvantage: 2 additional files added to the source tree (.c and .h) 
and some minor ifdefs used only for Solaris.


I think it is more a question of style than anything else, so I leave 
you to make the call.


Thanks for your patience,
Malcolm





On 12/10/2014 09:54 PM, Colin McCabe wrote:

On Wed, Dec 10, 2014 at 2:31 AM, malcolm  wrote:


Why don't you just use strerror_r inside terror()?

I wrote that code originally.  The reason I didn't want to use
strerror_r there is because GNU libc provides a non-POSIX definition
of strerror_r, and forcing it to use the POSIX one is a pain.  But you
can do it.  You also will require a thread-local buffer to hold the
return from strerror_r, since it is not guaranteed to be static
(although in practice it is 99% of the time-- another annoyance with
the API).
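
For reference, the glibc quirk mentioned above: with _GNU_SOURCE defined, strerror_r is the GNU variant, which returns char* and may ignore the caller's buffer; the POSIX variant returns int and always fills the buffer. One way to request the POSIX variant (a sketch; real builds usually set these macros via compiler flags instead) is:

    #undef _GNU_SOURCE
    #define _POSIX_C_SOURCE 200112L  /* selects the int-returning strerror_r */
    #include <string.h>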






Re: Solaris Port

2014-12-11 Thread malcolm
FYI, there are a couple more files that reference sys_errlist directly 
(not just terror within exception.c): hdfs_http_client.c and NativeIO.c.





Re: Solaris Port

2014-12-11 Thread malcolm

Fine with me, I volunteer to do this, if accepted.

On 12/11/2014 05:48 PM, Allen Wittenauer wrote:

sys_errlist was removed for a reason.  Creating a fake sys_errlist on Solaris 
will mean the libhadoop.so will need to be tied to a specific build 
(kernel/include pairing), and that therefore limits upward mobility/compatibility.  
That doesn’t seem like a very good idea.

IMO, switching to strerror_r is much preferred since, other than the brain-dead 
GNU libc version, it is highly portable and should work regardless of the kernel 
or OS in place.







Re: Solaris Port

2014-12-11 Thread malcolm

Hi Asokan,

I googled and found that Windows has strerror and strerror_s (which is 
the strerror_r equivalent).

Is there a reason why you didn't use this call?

On 12/11/2014 06:27 PM, Asokan, M wrote:

Hi Malcolm,
Recently, I had to work on a function to get the system error message on 
various systems.  Here is the piece of code I came up with.  Hope it helps.

static void get_system_error_message(char * buf, int buf_len, int code)
{
#if defined(_WIN32)
    LPVOID lpMsgBuf;
    DWORD status = FormatMessage(FORMAT_MESSAGE_ALLOCATE_BUFFER |
                                 FORMAT_MESSAGE_FROM_SYSTEM |
                                 FORMAT_MESSAGE_IGNORE_INSERTS,
                                 NULL, code,
                                 MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT),
                                 /* Default language */
                                 (LPTSTR) &lpMsgBuf, 0, NULL);
    if (status > 0)
    {
        strncpy(buf, (char *)lpMsgBuf, buf_len-1);
        buf[buf_len-1] = '\0';
        /* Free the buffer returned by system */
        LocalFree(lpMsgBuf);
    }
    else
    {
        _snprintf(buf, buf_len-1, "%s %d",
                  "Can't get system error message for code", code);
        buf[buf_len-1] = '\0';
    }
#else
#if defined(_HPUX_SOURCE)
    {
        char * msg;
        errno = 0;
        msg = strerror(code);
        if (errno == 0)
        {
            strncpy(buf, msg, buf_len-1);
            buf[buf_len-1] = '\0';
        }
        else
        {
            snprintf(buf, buf_len, "%s %d",
                     "Can't get system error message for code", code);
        }
    }
#else
    if (strerror_r(code, buf, buf_len) != 0)
    {
        snprintf(buf, buf_len, "%s %d",
                 "Can't get system error message for code", code);
    }
#endif
#endif
}

Note that HPUX does not have strerror_r(), since strerror() itself is 
thread-safe.  Also, Windows does not have snprintf(); the equivalent function 
_snprintf() has a subtle difference in its interface.

-- Asokan
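
For reference, the "subtle difference": unlike C99 snprintf, _snprintf does not NUL-terminate the buffer on truncation (and returns a negative value instead of the required length), which is why the Windows branch above passes buf_len-1 and terminates the buffer manually:

    char buf[8];
    _snprintf(buf, sizeof(buf) - 1, "%s", "a longer message");
    buf[sizeof(buf) - 1] = '\0';   /* ensure termination after truncation */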


Re: Solaris Port

2014-12-11 Thread malcolm
So, it turns out that if I had naively changed all calls to terror, or 
references to sys_errlist, to use strerror_r, then I would have broken 
the code for Windows and HPUX (and possibly other OSes).


If we are to assume that the current code runs fine on all platforms (maybe 
even AIX and MacOS, for example), then any changes/additions made to the 
code and not ifdeffed appropriately can break other OSes. On the 
other hand, too many ifdefs can pollute the source code and render it 
less readable (though that is possibly less important).


In the general case, what are code contributors' responsibilities when 
adding code, with regard to OSes besides Linux?

What OSes does Jenkins test on?
I guess maintainers of code on non-tested platforms are responsible for 
their own testing?


How do we avoid the ping-pong effect, i.e. I make a generic change to 
the code which breaks on Windows, and then the Windows maintainer's fix 
in turn breaks Solaris, for example? Or does this not happen in 
actuality?


On 12/11/2014 11:25 PM, Asokan, M wrote:

Hi Malcolm,
   The Windows versions of strerror() and strerror_s() functions are probably meant for 
ANSI C library functions that set errno.  For core Windows API calls (like UNIX system 
calls), one gets the error number by calling GetLastError() function.  In the code 
snippet I sent earlier, the "code" argument is the value returned by 
GetLastError().  Neither strerror() nor strerror_s() will give the correct error message 
for this error code.

You could probably look at libwinutils.c in Hadoop source.  It uses 
FormatMessageW (which returns messages in Unicode.)  My requirement was to 
return messages in current system locale.

-- Asokan
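
A minimal Windows-only sketch of the distinction Asokan describes (hypothetical example, not Hadoop code): a failed Win32 API call reports its error through GetLastError(), not errno, so strerror()/strerror_s() do not apply to it.

    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        HANDLE h = CreateFileA("no_such_file", GENERIC_READ, 0, NULL,
                               OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
        if (h == INVALID_HANDLE_VALUE) {
            DWORD code = GetLastError();   /* Win32 error code, not errno */
            printf("Win32 error code: %lu\n", (unsigned long)code);
        }
        return 0;
    }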

Re: Solaris Port

2014-12-13 Thread malcolm

Colin,

I am not sure what you mean by a thread-local buffer (in native code). 
In Java this is pretty standard, but I couldn't find any implementation 
for C code.
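
For reference, C compilers do offer thread-local storage: GCC and Clang via the __thread extension, and C11 via the _Thread_local keyword. A minimal sketch of a thread-local-buffer terror, assuming the POSIX strerror_r (an illustration, not the code that was eventually adopted):

    #include <stdio.h>
    #include <string.h>

    const char* terror_tls(int errnum)
    {
        /* Each thread gets its own copy of buf (GCC/Clang extension;
         * C11 spells it _Thread_local). */
        static __thread char buf[64];

        if (strerror_r(errnum, buf, sizeof(buf)) != 0) {
            snprintf(buf, sizeof(buf), "unknown error %d", errnum);
        }
        return buf;
    }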


Here is the terror function:

const char* terror(int errnum)
{
    if ((errnum < 0) || (errnum >= sys_nerr)) {
        return "unknown error.";
    }
    return sys_errlist[errnum];
}


The interface is identical to strerror, but the implementation is 
actually re-entrant since it returns a pointer to a static string.


If I understand your suggestion, the new function would look like this:

    const char* terror(int errnum)
    {
        static char result[65];

        strerror_r(errnum, result, 64);

        return result;
    }

No need for snprintf; strerror_r has the 'n' bounding built in.

Of course, this is still non-re-entrant, so unless the caller copies the 
returned buffer before the function is called again, there is a problem.


After considerable thought, I have come up with this version of terror, 
which tested OK on Windows, Linux, and Solaris:


    #if defined(_WIN32)
    #define strerror_r(errno,buf,len) strerror_s(buf,len,errno)
    #endif

    #define MAX_ERRORS 256
    #define MAX_ERROR_LEN 80

    char *terror(int errnum)
    {
        /* Cache of error messages, filled lazily on first use. */
        static char errlist[MAX_ERRORS][MAX_ERROR_LEN+1];

        if (errnum >= 0 && errnum < MAX_ERRORS) {
            if (errlist[errnum][0] == 0)
                strerror_r(errnum, errlist[errnum], MAX_ERROR_LEN);
            return errlist[errnum];
        } else {
            return "Unknown error";
        }
    }

This version is portable and re-entrant.

On Windows, the largest errnum is 43; on Ubuntu 14.04 we have 133; and 
on Solaris 11.1 we get 151.
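
A quick sanity check of the cached-table version might look like this (a sketch assuming the terror above is in scope; the exact message text is platform-dependent):

    #include <stdio.h>

    int main(void)
    {
        printf("%s\n", terror(2));     /* e.g. "No such file or directory" */
        printf("%s\n", terror(9999));  /* out of range: "Unknown error"    */
        return 0;
    }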


If this is OK with you, I will open a JIRA for this.


Thanks,
Malcolm


On 12/12/2014 11:10 PM, Colin McCabe wrote:

Just use snprintf to copy the error message from strerror_r into a
thread-local buffer of 64 bytes or so.  Then preserve the existing
terror() interface.

Can you open a jira for this?

best,
Colin


Re: Solaris Port

2014-12-13 Thread malcolm

Thanks Asokan,

I looked up GCC's thread-local variables; they seem a bit complex, though, 
and quite specific to GNU.


Initialization of the static errlist array should be thread-safe: 
initially the array is nulled out, and afterwards, if two threads write 
to the same address, they would be writing the same string.


But if we are OK with changing 5 files, not just terror, then I would 
just remove terror completely and use strerror_r (or the alternatives 
for Windows and HPUX) in the caller code instead, i.e. using your 
suggestion of a local buffer in the caller, wherever needed. The more I 
think about it, the more this seems to be the right thing to do.


Cheers,
Malcolm


On 12/13/2014 04:38 PM, Asokan, M wrote:

Malcolm,
GCC supports thread-local variables. See

https://gcc.gnu.org/onlinedocs/gcc-3.3.1/gcc/Thread-Local.html

I am not sure about native compilers on Solaris, HPUX, or AIX.

In any case, I found out that the Windows native code in Hadoop seems to handle 
error messages properly. Here is what I found:

$ find ~/work/hadoop/hadoop-trunk/ -name '*.c'|xargs grep FormatMessage|awk -F: '{print $1}'|sort -u
/home/asokan/work/hadoop/hadoop-trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/nativeio/NativeIO.c
/home/asokan/work/hadoop/hadoop-trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/security/JniBasedUnixGroupsMappingWin.c
/home/asokan/work/hadoop/hadoop-trunk/hadoop-common-project/hadoop-common/src/main/winutils/libwinutils.c

$ find ~/work/hadoop/hadoop-trunk/ -name '*.c'|xargs grep terror|awk -F: '{print $1}'|sort -u
/home/asokan/work/hadoop/hadoop-trunk/hadoop-common-project/hadoop-common/src/main/native/src/exception.c
/home/asokan/work/hadoop/hadoop-trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/nativeio/SharedFileDescriptorFactory.c
/home/asokan/work/hadoop/hadoop-trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/net/unix/DomainSocket.c
/home/asokan/work/hadoop/hadoop-trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/net/unix/DomainSocketWatcher.c
/home/asokan/work/hadoop/hadoop-trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/security/JniBasedUnixGroupsMapping.c


This means you need not worry about the Windows version of terror(). You need 
to change five files that contain UNIX-specific native code.

I have a question on your suggested implementation:

How do you initialize the static errlist array in a thread-safe manner?


Here is another thread-safe implementation that I could come up with:

/* Note: the original mail's include lines were garbled in the archive; the
 * headers below are reconstructed from the functions the program uses. */
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <stdlib.h>

#define MESSAGE_BUFFER_SIZE 256

char * getSystemErrorMessage(char * buf, int buf_len, int code) {
#if defined(_HPUX_SOURCE)
    char * msg;
    errno = 0;
    msg = strerror(code);
    if (errno == 0) {
        strncpy(buf, msg, buf_len-1);
        buf[buf_len-1] = '\0';
    } else {
        snprintf(buf, buf_len, "%s %d",
                 "Can't get system error message for code", code);
    }
#else
    if (strerror_r(code, buf, buf_len) != 0) {
        snprintf(buf, buf_len, "%s %d",
                 "Can't get system error message for code", code);
    }
#endif
    return buf;
}

#define TERROR(code) \
    getSystemErrorMessage(messageBuffer, sizeof(messageBuffer), code)

int main(int argc, char ** argv) {
    if (argc > 1) {
        char messageBuffer[MESSAGE_BUFFER_SIZE];
        int code = atoi(argv[1]);

        fprintf(stderr, "System error for code %s: %s\n", argv[1], TERROR(code));
    }
    return 0;
}


This changes terror to a macro, TERROR, and requires all functions that call 
the TERROR macro to declare the local variable messageBuffer. Since there are 
only five files to modify, I think it is not a big effort. What do you think?

-- Asokan


Re: Solaris Port SOLVED!

2014-12-13 Thread malcolm

Wiping egg off face  ...

After consulting with the Solaris team (and looking at the source code 
and man page), it turns out that strerror itself on Solaris is MT-Safe! 
(Just like on HPUX.)


So, after all this effort, all I need to do is modify terror as follows:

    const char* terror(int errnum)
    {
    #if defined(__sun)
        return strerror(errnum); // MT-Safe under Solaris
    #else
        if ((errnum < 0) || (errnum >= sys_nerr)) {
            return "unknown error.";
        }
        return sys_errlist[errnum];
    #endif
    }

And in the two other files where sys_errlist is referenced directly 
(NativeIO.c and hdfs_http_client.c), I replaced the direct access with 
a call to terror.
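
In other words, a direct reference such as (hypothetical call site, for illustration)

    fprintf(stderr, "write failed: %s\n", sys_errlist[errno]);

becomes

    fprintf(stderr, "write failed: %s\n", terror(errno));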


Thanks for all your help and patience,

I'll file a JIRA asap,

Cheers,
Malcolm


Re: Solaris Port SOLVED!

2014-12-13 Thread malcolm
I checked with the latest release of Solaris 11 and yes, it is still 
thread-safe (or MT-Safe, as documented on the man page).

strerror checks the error code and returns the same "unknown error" 
string as terror does if it receives an invalid code. I checked this on 
Windows, Solaris, and Linux (though my changes only affect Solaris 
platforms).
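
A quick way to confirm that behaviour is a check along these lines (a sketch; the exact wording of the message is platform-dependent):

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        printf("%s\n", strerror(2));     /* a valid code */
        printf("%s\n", strerror(9999));  /* an invalid code: "Unknown error ..." */
        return 0;
    }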


JIRA newbie question:

I have filed the JIRA (HADOOP-11403) against the trunk, attached the 
patch, and asked for reviewers in the comments section.

Is there any other protocol I should follow?

Thanks,
Malcolm

On 12/14/2014 01:08 AM, Asokan, M wrote:

Malcolm,
That's great! Is strerror() thread-safe in the recent version of Solaris?  
In any case, to be correct you still need to make sure that the code passed to 
strerror() is a valid one.  For this you need to check errno after the call to 
strerror().  Please check the code snippet I sent earlier for HPUX.

-- Asokan

Re: Solaris Port SOLVED!

2014-12-15 Thread malcolm

Done, and I added the comment as you requested.
I attached a second patch file to the JIRA (with .002 appended, as per 
convention), assuming Jenkins knows to take the latest version, since I 
understand that I cannot remove the previous patch file.


On 12/16/2014 04:12 AM, Colin McCabe wrote:

Thanks, Malcolm.  I reviewed it.  The only thing you still have to do
is hit "submit patch" to get a Jenkins run.  See our HowToContribute
wiki page for more details.

wiki.apache.org/hadoop/HowToContribute

best,
Colin


Re: Solaris Port SOLVED!

2014-12-16 Thread malcolm

This is weird; Jenkins complains about:

1. Findbugs: 3 warnings in Java code (which of course I did not touch)
2. Test failures with no connection to terror: a Java socket timeout

As a newbie, I am not quite sure how to relate to this.
(I could just revert the code back and see if I get the same errors 
anyway.)


On 12/16/2014 06:57 AM, malcolm wrote:

Done, and added the comment as you requested.
I attached a second patch file to the JIRA (with .002 appended as per 
convention) assuming Jenkins knows to take the latest version, since I 
understand that I cannot remove the previous patch file .


On 12/16/2014 04:12 AM, Colin McCabe wrote:

Thanks, Malcom.  I reviewed it.  The only thing you still have to do
is hit "submit patch" to get a Jenkins run.  See our HowToContribute
wiki page for more details.

wiki.apache.org/hadoop/HowToContribute

best,
Colin

On Sat, Dec 13, 2014 at 9:22 PM, malcolm 
 wrote:

I am checking on the latest release of Solaris 11 and yes, it is still
thread safe (or MT Safe as documented on the man page).

strerror checks the error code, and returns the same "unknown error" 
string
as terror does, if it receives an invalid code. I checked this on 
Windows,

Solaris and Linux (though my changes only affect Solaris platforms).

JIRA newbie question:

I have filed the JIRA (HADOOP-11403) against trunk, attaching the patch
and asking for reviewers in the comments section.
Is there any other protocol I should follow?

Thanks,
Malcolm


On 12/14/2014 01:08 AM, Asokan, M wrote:

Malcolm,
 That's great! Is strerror() thread-safe in the recent version of
Solaris?  In any case, to be correct you still need to make sure that
the code passed to strerror() is a valid one.  For this you need to
check errno after the call to strerror().  Please check the code
snippet I sent earlier for HPUX.

-- Asokan
________
From: malcolm [malcolm.kaval...@oracle.com]
Sent: Saturday, December 13, 2014 3:13 PM
To: common-dev@hadoop.apache.org
Subject: Re: Solaris Port SOLVED!

Wiping egg off face ...

After consulting with the Solaris team (and looking at the source code
and man page), it turns out that strerror itself on Solaris is MT-Safe!
(Just like HPUX)

So, after all this effort, all I need to do is modify terror as follows:


  const char* terror(int errnum)
  {
  #if defined(__sun)
      return strerror(errnum);  // MT-Safe under Solaris
  #else
      if ((errnum < 0) || (errnum >= sys_nerr)) {
          return "unknown error.";
      }
      return sys_errlist[errnum];
  #endif
  }

And in two other files where sys_errlist is referenced directly
(NativeIO and hdfs_http_client.c), I replaced the direct access with a
call to terror.
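
For illustration, the shape of that change looks roughly like this (the
message text and variable names are hypothetical, not the actual Hadoop
lines):

  // Before: direct indexing, which no longer compiles on Solaris
  snprintf(msg, sizeof(msg), "open failed: %s", sys_errlist[err]);

  // After: route through terror(), which now handles Solaris
  snprintf(msg, sizeof(msg), "open failed: %s", terror(err));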

Thanks for all your help and patience,

I'll file a JIRA asap,

Cheers,
Malcolm

On 12/13/2014 05:26 PM, malcolm wrote:

Thanks Asokan,

Looked up Gcc's thread-local variables; they seem a bit complex though,
and quite specific to Gnu.

Initialization of the static errlist array should be thread-safe, i.e.
initially the array is nulled out, and if two threads afterwards write
to the same address, they would be writing the same string.

But if we are OK with changing 5 files, not just terror, then I would
just remove terror completely and use strerror_r (or the alternatives
for Windows and HP-UX) in the caller code instead, i.e. use your
suggestion of a local buffer in the caller, wherever needed. The more
I think about it, the more this seems to be the right thing to do.

Cheers,
Malcolm


On 12/13/2014 04:38 PM, Asokan, M wrote:

Malcolm,
  Gcc supports thread-local variables. See

https://gcc.gnu.org/onlinedocs/gcc-3.3.1/gcc/Thread-Local.html

I am not sure about native compilers on Solaris, HPUX, or AIX.

In any case, I found out that the Windows native code in Hadoop 
seems

to handle error messages properly. Here is what I found:

$ find ~/work/hadoop/hadoop-trunk/ -name '*.c'|xargs grep FormatMessage|awk -F: '{print $1}'|sort -u

/home/asokan/work/hadoop/hadoop-trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/nativeio/NativeIO.c

/home/asokan/work/hadoop/hadoop-trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/security/JniBasedUnixGroupsMappingWin.c

/home/asokan/work/hadoop/hadoop-trunk/hadoop-common-project/hadoop-common/src/main/winutils/libwinutils.c

$ find ~/work/hadoop/hadoop-trunk/ -name '*.c'|xargs grep terror|awk
-F: '{print $1}'|sort -u

/home/asokan/work/hadoop/hadoop-trunk/hadoop-common-project/hadoop-common/src/main/native/src/exception.c

/home/asokan/work/hadoop/hadoop-trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/nativeio/SharedFileDescriptorFactory.c

/home/asokan/work/hadoop/hadoop-trunk/hadoop

CPU allocation when running YARN on Docker

2019-04-01 Thread Malcolm McFarland
Hey folks,

(apologies if this is a duplicate, I don't think my first message went
through)

I'm running Samza 0.14.1 with YARN 2.6.1 on Docker 18.06.1 in ECS. (I know
YARN on Docker is somewhat unorthodox, but it's how the ops team at our
company has things set up.) It's running quite well overall -- I have 2
resource managers and 3 node managers communicating smoothly.

The trouble is occurring with the application container CPU allocation. I'm
running Samza on this cluster, and although it starts up just fine and
works fine when it requests 1 CPU/container, it won't start any container
with more than 1 core. I see this message in the Samza log: "Got AM
register response. The YARN RM supports container requests with max-mem:
16384, max-cpu: 1".

Looking at the source code, this is derived from the
RegisterApplicationMasterResponse returned from
AMRMClientAsync.registerApplicationMaster(). I'm trying to trace how YARN
determines the result for
response.getMaximumResourceCapability().getVirtualCores(), and it's a bit
difficult. Does anybody have an overview about how this value is
determined, and what might be specific about a docker container? Here are
some relevant YARN configuration values (these are available on both the RM
and NM):

yarn.nodemanager.resource.cpu-vcores=8
yarn.nodemanager.resource.memory-mb=16384
yarn.nodemanager.vmem-check-enabled=false
yarn.nodemanager.vmem-pmem-ratio=2.1
yarn.scheduler.minimum-allocation-mb=256
yarn.scheduler.maximum-allocation-mb=16384
yarn.scheduler.minimum-allocation-vcores=1
yarn.scheduler.maximum-allocation-vcores=16

Thanks for the help,
Malcolm

-- 
Malcolm McFarland
Cavulus


Re: CPU allocation when running YARN on Docker

2019-04-15 Thread Malcolm McFarland
Hi Prabhu,

Sorry for the delayed response; it was indeed the
maximum-allocation-vcores setting. I had interpreted the description of
maximum-allocation-vcores as a per-container setting, when it actually
seems to be the full allocation across the cluster.
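
For anyone who hits the same thing: per Prabhu's explanation below, the
cap comes from the queue-level setting in capacity-scheduler.xml, which
would look roughly like this (the queue name "default" is an assumption;
adjust for your own queue path):

yarn.scheduler.capacity.root.default.maximum-allocation-vcores=8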

Cheers,
Malcolm

On Tue, Apr 2, 2019 at 1:01 AM Prabhu Josephraj 
wrote:

> Hi Malcolm,
>
>   The Scheduler sets the max-vcores in RegisterApplicationMasterResponse
> from the queue's configured max value
> (yarn.scheduler.capacity.root.<queue-name>.maximum-allocation-vcores) in
> the scheduler configuration (capacity-scheduler.xml / fair-scheduler.xml).
> If not specified, it returns the value from the YARN configuration
> yarn.scheduler.minimum-allocation-vcores. Can you check whether the queue
> where the Samza job runs has maximum-allocation-vcores specified?
>
> Thanks,
> Prabhu Joseph
>
>
> On Tue, Apr 2, 2019 at 1:19 AM Malcolm McFarland 
> wrote:
>
>> Hey folks,
>>
>> (apologies if this is a duplicate, I don't think my first message went
>> through)
>>
>> I'm running Samza 0.14.1 with YARN 2.6.1 on Docker 18.06.1 in ECS. (I know
>> YARN on Docker is somewhat unorthodox, but it's how the ops team at our
>> company has things set up.) It's running quite well overall -- I have 2
>> resource managers and 3 node managers communicating smoothly.
>>
>> The trouble is occurring with the application container CPU allocation.
>> I'm
>> running Samza on this cluster, and although it starts up just fine and
>> works fine when it requests 1 CPU/container, it won't start any container
>> with more than 1 core. I see this message in the Samza log: "Got AM
>> register response. The YARN RM supports container requests with max-mem:
>> 16384, max-cpu: 1".
>>
>> Looking at the source code, this is derived from the
>> RegisterApplicationMasterResponse returned from
>> AMRMClientAsync.registerApplicationMaster(). I'm trying to trace how YARN
>> determines the result for
>> response.getMaximumResourceCapability().getVirtualCores(), and it's a bit
>> difficult. Does anybody have an overview about how this value is
>> determined, and what might be specific about a docker container? Here are
>> some relevant YARN configuration values (these are available on both the
>> RM
>> and NM):
>>
>> yarn.nodemanager.resource.cpu-vcores=8
>> yarn.nodemanager.resource.memory-mb=16384
>> yarn.nodemanager.vmem-check-enabled=false
>> yarn.nodemanager.vmem-pmem-ratio=2.1
>> yarn.scheduler.minimum-allocation-mb=256
>> yarn.scheduler.maximum-allocation-mb=16384
>> yarn.scheduler.minimum-allocation-vcores=1
>> yarn.scheduler.maximum-allocation-vcores=16
>>
>> Thanks for the help,
>> Malcolm
>>
>> --
>> Malcolm McFarland
>> Cavulus
>>
>

-- 
Malcolm McFarland
Cavulus
1-800-760-6915
mmcfarl...@cavulus.com


This correspondence is from HealthPlanCRM, LLC, d/b/a Cavulus. Any
unauthorized or improper disclosure, copying, distribution, or use of the
contents of this message is prohibited. The information contained in this
message is intended only for the personal and confidential use of the
recipient(s) named above. If you have received this message in error,
please notify the sender immediately and delete the original message.


[jira] [Created] (HADOOP-16556) Fix some LGTM alerts

2019-09-12 Thread Malcolm Taylor (Jira)
Malcolm Taylor created HADOOP-16556:
---

 Summary: Fix some LGTM alerts
 Key: HADOOP-16556
 URL: https://issues.apache.org/jira/browse/HADOOP-16556
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Malcolm Taylor


LGTM analysis of Hadoop has raised some alerts
(https://lgtm.com/projects/g/apache/hadoop/?mode=tree). This issue is to fix
some of the more straightforward ones.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-11403) Solaris does not support sys_errlist requires use of strerror instead

2014-12-13 Thread Malcolm Kavalsky (JIRA)
Malcolm Kavalsky created HADOOP-11403:
-

 Summary: Solaris does not support sys_errlist requires use of 
strerror instead
 Key: HADOOP-11403
 URL: https://issues.apache.org/jira/browse/HADOOP-11403
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.5.0, 2.4.0, 2.3.0, 2.2.0
 Environment: Solaris 11.1 (Sparc, Intel), Linux x86
Reporter: Malcolm Kavalsky
Assignee: Malcolm Kavalsky
 Fix For: 2.6.0


sys_errlist has been removed from Solaris. The new interface is strerror.  
Wherever sys_errlist is accessed we should change to using strerror instead.
We already have an interface function, terror, which can contain this
functionality, so we should use it instead of directly accessing sys_errlist.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-11623) Native compilation fails on Solaris due to use of syscall function.

2015-02-23 Thread Malcolm Kavalsky (JIRA)
Malcolm Kavalsky created HADOOP-11623:
-

 Summary: Native compilation fails on Solaris due to use of syscall 
function.
 Key: HADOOP-11623
 URL: https://issues.apache.org/jira/browse/HADOOP-11623
 Project: Hadoop Common
  Issue Type: Bug
  Components: native
Affects Versions: 2.6.0
 Environment: Solaris 11.2
Reporter: Malcolm Kavalsky
Assignee: Malcolm Kavalsky
 Fix For: 2.6.0


Solaris does not provide the syscall function. Currently, Hadoop makes very
limited use of this function (only 2 files). These need to be "ifdeffed" with
the correct alternatives for Solaris.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-11655) Native compilation fails on Solaris due to use of getgrouplist function.

2015-03-01 Thread Malcolm Kavalsky (JIRA)
Malcolm Kavalsky created HADOOP-11655:
-

 Summary: Native compilation fails on Solaris due to use of 
getgrouplist function.
 Key: HADOOP-11655
 URL: https://issues.apache.org/jira/browse/HADOOP-11655
 Project: Hadoop Common
  Issue Type: Bug
 Environment: Solaris 11.2
Reporter: Malcolm Kavalsky


getgrouplist() does not exist in Solaris, thus preventing compilation of the 
native libraries. 
The easiest solution would be to port this function from Linux or FreeBSD to 
Solaris and add it to the library if compiling for Solaris.
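
A rough sketch of what such a port might look like, built on the portable
(but not thread-safe) getgrent() enumeration; this is illustrative only,
not the actual patch:

  #include <sys/types.h>
  #include <grp.h>
  #include <string.h>

  /* Mimics the glibc contract: stores up to *ngroups gids, sets *ngroups
   * to the total found, and returns that total or -1 if it did not fit. */
  int my_getgrouplist(const char *user, gid_t basegid,
                      gid_t *groups, int *ngroups)
  {
      int found = 0;
      struct group *gr;

      if (found < *ngroups)
          groups[found] = basegid;          /* primary group comes first */
      found++;

      setgrent();
      while ((gr = getgrent()) != NULL) {
          if (gr->gr_gid == basegid)
              continue;                     /* already counted */
          for (char **m = gr->gr_mem; *m != NULL; m++) {
              if (strcmp(*m, user) == 0) {
                  if (found < *ngroups)
                      groups[found] = gr->gr_gid;
                  found++;
                  break;
              }
          }
      }
      endgrent();

      int fits = (found <= *ngroups);
      *ngroups = found;
      return fits ? found : -1;
  }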




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-11848) Incorrect arguments to sizeof in DomainSocket.c

2015-04-20 Thread Malcolm Kavalsky (JIRA)
Malcolm Kavalsky created HADOOP-11848:
-

 Summary: Incorrect arguments to sizeof in DomainSocket.c
 Key: HADOOP-11848
 URL: https://issues.apache.org/jira/browse/HADOOP-11848
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Malcolm Kavalsky
Assignee: Malcolm Kavalsky
 Fix For: 2.6.0


The length of the buffer to be zeroed via sizeof should use the structure
itself, not the address of the structure.

DomainSocket.c line 156

Replace current:
memset(&addr, 0, sizeof(&addr));

With:
memset(&addr, 0, sizeof(addr));



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-11952) Native compilation on Solaris fails on Yarn due to use of FTS

2015-05-11 Thread Malcolm Kavalsky (JIRA)
Malcolm Kavalsky created HADOOP-11952:
-

 Summary: Native compilation on Solaris fails on Yarn due to use of 
FTS
 Key: HADOOP-11952
 URL: https://issues.apache.org/jira/browse/HADOOP-11952
 Project: Hadoop Common
  Issue Type: Bug
 Environment: Solaris 11.2
Reporter: Malcolm Kavalsky
Assignee: Malcolm Kavalsky
 Fix For: 2.7.1


Compiling the YARN Node Manager results in "fts" not found. On Solaris we
have an alternative, ftw, with similar functionality.
This is isolated to a single file, container-executor.c.
Note that this will just fix the compilation error. A more serious issue is
that Solaris does not support cgroups as Linux does.
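
For reference, a minimal sketch of the ftw-family replacement (nftw)
alluded to above; the callback here is illustrative, not the actual
container-executor change:

  #define _XOPEN_SOURCE 500
  #include <ftw.h>
  #include <stdio.h>
  #include <sys/stat.h>

  /* Called once per entry; FTW_F marks a regular file. */
  static int visit(const char *path, const struct stat *sb,
                   int typeflag, struct FTW *ftwbuf)
  {
      if (typeflag == FTW_F)
          printf("file: %s (%lld bytes)\n", path, (long long)sb->st_size);
      return 0;    /* non-zero would abort the walk */
  }

  int main(int argc, char **argv)
  {
      /* FTW_PHYS: do not follow symlinks, like fts's FTS_PHYSICAL. */
      return nftw(argc > 1 ? argv[1] : ".", visit, 16, FTW_PHYS);
  }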




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-11953) Binary flags for NativeIO incorrect on Solaris

2015-05-11 Thread Malcolm Kavalsky (JIRA)
Malcolm Kavalsky created HADOOP-11953:
-

 Summary: Binary flags for NativeIO incorrect on Solaris
 Key: HADOOP-11953
 URL: https://issues.apache.org/jira/browse/HADOOP-11953
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.5.2, 2.7.0, 2.6.0, 2.4.1, 2.5.0, 2.6.1, 2.8.0, 2.7.1
Reporter: Malcolm Kavalsky
Assignee: Malcolm Kavalsky
 Fix For: 2.7.1


NativeIO.c has defines for the standard file open flags (O_CREAT, etc.).
These values are different on Solaris (similar to FreeBSD).
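
One common way to handle this, sketched here as an assumption about the
shape of the fix rather than the actual patch, is to translate the
Linux-valued constants the Java side passes down into the host platform's
own flag values:

  #include <fcntl.h>

  /* Values the Java side assumes (Linux's octal encodings). */
  #define JAVA_O_CREAT  0100
  #define JAVA_O_EXCL   0200
  #define JAVA_O_TRUNC 01000

  /* Map Linux-valued bits onto whatever this platform defines. */
  static int translate_open_flags(int java_flags)
  {
      int flags = 0;
      if (java_flags & JAVA_O_CREAT) flags |= O_CREAT;
      if (java_flags & JAVA_O_EXCL)  flags |= O_EXCL;
      if (java_flags & JAVA_O_TRUNC) flags |= O_TRUNC;
      return flags;
  }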



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-11954) Solaris does not support RLIMIT_MEMLOCK as in Linux

2015-05-11 Thread Malcolm Kavalsky (JIRA)
Malcolm Kavalsky created HADOOP-11954:
-

 Summary: Solaris does not support RLIMIT_MEMLOCK as in Linux
 Key: HADOOP-11954
 URL: https://issues.apache.org/jira/browse/HADOOP-11954
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.5.2, 2.7.0, 2.6.0, 2.4.1, 2.3.0, 2.2.0
Reporter: Malcolm Kavalsky
Assignee: Malcolm Kavalsky
 Fix For: 2.7.1


This affects the JNI call to NativeIO_getMemlockLimit0.
We can just return 0, as is done on Windows, which also does not support this
feature.
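
A minimal sketch of that ifdef (illustrative; the real NativeIO code also
deals with JNI types and RLIM_INFINITY):

  #include <sys/resource.h>

  /* Returns the memlock soft limit in bytes, or 0 where unsupported. */
  long long get_memlock_limit(void)
  {
  #ifdef RLIMIT_MEMLOCK
      struct rlimit rlim;
      if (getrlimit(RLIMIT_MEMLOCK, &rlim) != 0)
          return 0;                     /* treat errors as "no limit info" */
      return (long long)rlim.rlim_cur;  /* current soft limit */
  #else
      return 0;                         /* e.g. Solaris, Windows */
  #endif
  }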



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HADOOP-11655) Native compilation fails on Solaris due to use of getgrouplist function.

2015-05-28 Thread Malcolm Kavalsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Malcolm Kavalsky resolved HADOOP-11655.
---
  Resolution: Fixed
Release Note: Updating Solaris 11.2 to the latest SRU (17 March 2015)
fixes this issue.

> Native compilation fails on Solaris due to use of getgrouplist function.
> 
>
> Key: HADOOP-11655
> URL: https://issues.apache.org/jira/browse/HADOOP-11655
> Project: Hadoop Common
>  Issue Type: Sub-task
> Environment: Solaris 11.2
>    Reporter: Malcolm Kavalsky
>Assignee: Malcolm Kavalsky
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> getgrouplist() does not exist in Solaris, thus preventing compilation of the 
> native libraries. 
> The easiest solution would be to port this function from Linux or FreeBSD to 
> Solaris and add it to the library if compiling for Solaris.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)