Re: Solaris Port

Steve Loughran Sat, 13 Dec 2014 04:06:07 -0800

On 12 December 2014 at 04:35, malcolm <malcolm.kaval...@oracle.com> wrote:


> So, turns out that if I had naively changed all calls to terror or
> references to sys_errlist, to using strerror_r, then I would have broken
> code for Windows and HPUX (and possibly other OSes).
>
> If we are to assume that current code runs fine on all platforms (maybe
> even AIX and MacOS, for example), then any change/additions made to the code
> and not ifdeffed appropriately can break on other OSes. On the other hand,
> too many ifdefs can pollute the code source and render it less readable
> (though possibly less important).
>
>
Readability is less important than correct functionality. The native
libraries are not the main focus of development, and fewer people work on
them. That's a strength (those who do are presumably competent) and a
weakness (fewer people are able to review and develop the code).
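
As a rough sketch of what "ifdeffed appropriately" could look like for the
strerror_r case, the platform differences can be confined to one small
wrapper so that callers stay free of ifdefs. The helper name
portable_strerror is hypothetical and not taken from the Hadoop tree. The
three branches are assumptions about the Windows CRT (strerror_s), glibc's
GNU variant of strerror_r, and the POSIX/XSI variant; HP-UX, AIX and other
platforms would need checking before relying on the fallback branch.

```c
/* Ask for the POSIX/XSI strerror_r when nothing else has been requested. */
#if !defined(_WIN32) && !defined(_GNU_SOURCE) && !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 200112L
#endif

#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: copy the message for errnum into buf and return buf.
 * Every platform quirk lives in this one function, so callers need no
 * ifdefs of their own.
 */
static const char *portable_strerror(int errnum, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    buf[0] = '\0';
#if defined(_WIN32)
    /* The Windows CRT has no strerror_r; strerror_s returns 0 on success. */
    if (strerror_s(buf, len, errnum) != 0) {
        snprintf(buf, len, "Unknown error %d", errnum);
    }
#elif defined(__GLIBC__) && defined(_GNU_SOURCE)
    /* The GNU variant returns a char* that may or may not point at buf. */
    const char *msg = strerror_r(errnum, buf, len);
    if (msg != buf) {
        snprintf(buf, len, "%s", msg);
    }
#else
    /* The XSI variant returns 0 on success and non-zero on failure. */
    if (strerror_r(errnum, buf, len) != 0) {
        snprintf(buf, len, "Unknown error %d", errnum);
    }
#endif
    return buf;
}
```

A caller would then pass its own buffer, for example
portable_strerror(errno, buf, sizeof buf) with a local char buf[256], and
never touch sys_errlist or a platform-specific strerror_r signature directly.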


> In the general case what are code contributors responsibilities to adding
> code regarding OSes besides Linux ?
> What OSes does jenkins test on ?
>

ASF Jenkins currently tests on Debian on a pool of hosts:

https://builds.apache.org/view/H-L/view/Hadoop/


There are some Windows hosts to which Jenkins could also schedule work,
though I've found them problematic. Specifically, you can't run a MiniYARN
cluster there due to some FS permissions issues; see
https://builds.apache.org/job/slider-develop-windows/

There is a Solaris host in the Jenkins pool. Adding a hadoop-solaris job
would be the best way of guaranteeing that regressions which break Solaris
are picked up early.


> I guess maintainers of code on non-tested platforms are responsible for
> their own testing ?
>
>
My colleagues run Jenkins builds and tests on our other supported
platforms, RHEL/CentOS and Windows; regressions which break these will be
caught relatively rapidly. As an example from yesterday,
https://issues.apache.org/jira/browse/HDFS-7514 , "TestTextCommand fails on
Windows". Path setup is always a problem there, usually in tests.

I assume others are doing similar things, especially those using Apache
Bigtop as a test suite for alternative filesystems. The tests there are
what you should really try doing internally.

OS/X is one of the two primary dev platforms; any regression there will be
picked up by the engineers within 24h.


> How do we avoid the ping-pong effect, i.e. I make a generic change to code
> which breaks on Windows, then the Windows maintainer reverts changes to
> break on Solaris for example ? Or does this not happen in actuality ?
>

Ideally, regressions should be picked up during the review phase.

If something does slip through, then the following is likely to take place:

   1. A new JIRA is created, with a "caused-by" link pointing to the patch
   which caused the problem.
   2. Either the patch at fault is reverted, pending a revised, fixed
   version, or a follow-up patch is applied to fix the problem. The latter
   is likely if the problem is trivial.

There isn't likely to be a ping-pong effect: if a patch breaks
compatibility and is reverted, it is then effectively the responsibility of
the authors of the original patch to fix things and resubmit a working
patch. And, after the first failure, reviews will be a lot stricter.

More succinctly: *stuff which causes regressions doesn't get in*

One thing you can do is set up a Windows VM for build and test. We have the
instructions, and I have such a VM to hand. I'd also recommend you have a
Kerberized Linux VM to test what happens there.

You also have the freedom to review other submitted patches and highlight
where you think they will cause problems on Solaris. Unless/until you
become a committer you don't get the right to say +1 and get the patch in,
but you can certainly flag up where a change isn't compatible, then work
with the patch developers to nurture it into shape.


Where we do have trouble in the ASF is:
 -nonstandard JVMs, i.e. anything other than Oracle and OpenJDK. You should
not have that problem.
 -native filesystems that don't get used much (yet)
 -non-HDFS filesystems. This is why we don't support any except the object
store libraries bundled in the Hadoop source tree: those are object stores
that anyone with the relevant cloud logins can test from their desk.
 -nonstandard networking setups. Anything to test there and find problems
is welcome.

The shell scripts are also undertested and only intermittently maintained,
though there's been recent work there in HADOOP-9902(?) which is a great
improvement to trunk. If you can help test the CLI on your OS, that will
save you support calls.

-Steve
