I am going to try to minimize my reaction to Moon’s e-mail.  

The tl;dr is this:   

The reason we are having this discussion now is that active users of the PR — 
which now has its own user base — went public to complain about this. 

The PR has been tested by an active user base for more than three months.  
No-one has been able to identify any specific actual licensing problem, and the 
PR was prepared based on an extensive, careful review of the relevant licensing 
issues and after contacting the relevant people.  

It is not an explanation for someone who has been ignoring my “how can I move 
this forward…” emails for three months to point the finger and say I didn’t 
contact the right person or file the right report. 

The burden for providing an explanation for the inaction is on the PMCC at this 
point. 


I'm sorry, but the other PRs are passing CI. If it's problem on Zeppelin core, 
why do you think other PRs are passing CI?
They’re not!  I often see comments on PRs to just ignore that CI is failing. 

One of the most common reasons this PR fails CI, is CI times-out downloading 
Spark to install.  How could that possibly be caused by the PR?

It looks to me like the only PRs with changes to the relevant parts of the code 
— the SparkInterpreter — are being made by the person who wrote the testing 
suite.

So, that would explain why some other PRs pass CI:  Neither the 
SparkInterpreter nor the testing suite are stable or robust, but since the PRs 
are coming from the person who wrote both… 

And let's say Zeppelin core has problem and that makes your PR fails on CI 
test. That's possible. But it still does not mean we can merge it with CI 
failing.

It means you should be working with me to figure out why the CI is failing.

This PR has been tested by an active user base for the past three months.  If 
CI is continuing to fail, and dozens of hours of effort have not resolved the 
CI issues, then it is time to start considering whether the testing suite is 
part of the problem. 

The level of defensiveness about the CI and SparkInterpreter is not helping to 
resolve these issues. 

If you think it's problem on Zeppelin core, then file an issue that reproduce 
the problem on Zeppelin core, that might be more efficient than keep trying 
yourself.
I contacted you numerous times about such issues...

In my interpretation, KnitRInterpreter is not an optional feature while it is 
always enabled when compiling Zeppelin and always enabled when running 
Zeppelin. And it requires dynamically linked GPL library on runtime. (yes it 
will fail when no KnitR is installed on the system)

Its not always enabled. 
It is not dynamically linked at runtime. 
It will not fail when knitr is missing.  If knitr is not present, the repl 
interpreter starts and a note is written to to the log that the knitr 
interpreter isn’t available because knitr is not present.  

no Apache code can ever call a shell script, on the purpose of dynamic linking 
with GPL library.
You misunderstand. 

The *shell* is GPL'd.  

Is Zeppelin “linked" against the GPL’d shell because Zeppelin depends on a 
shell script to launch?  

Obviously not.

The interaction with R in the PR is the same. 

I was guessing SparkR can be still in Apache License even if it is depends on 
R. Because of GPL licensed compiler generated output is not GPL license. and R 
is sort of compiler. If you can get answer from Spark community how SparkR get 
managed to stay in Apache License while R is GPL, the answer might help.
The description of SparkR is not accurate in any respect.  (Do you think SparkR 
is not talking to GPL-licensed libraries?)

I don’t see that any genuine issue is being raised here.   

If there is an issue, the burden is on you to identify it. 

If i give you one suggestion, Zeppelin committers sometimes ask rebase the 
contribution branch for some reason. It is not the really the best practice, 
but still okay while most contributions are not including large code base 
changes
However, your one, has more than 4000 lines of code change. If you rebase  then 
review should be started from the beginning, again. So you might want to 
minimize the rebase your branch.

Are you actually complaining that the problem is that I rebased the code during 
the three-month period when no-one looked at it and Zeppelin went through a 
release?

I cannot take it seriously when you say things like this.  

Having to “start from the beginning” cannot be a problem if you never actually 
started the first time...



From: moon soo Lee <[email protected]>
Reply: moon soo Lee <[email protected]>
Date: December 1, 2015 at 4:42:06 AM
To: Amos B. Elberg <[email protected]>, [email protected] 
<[email protected]>
Subject:  Re: contributions impasse. Was: [GitHub] incubator-zeppelin pull 
request: R Interpreter for Zeppelin  

On Tue, Dec 1, 2015 at 4:40 PM Amos B. Elberg <[email protected]> wrote:
Thank you, Cos.  

I’d like to briefly address the issues raised by Moon:

1. This PR does not passes CI 
The CI fails on core Zeppelin, *not* code in this PR.  

I’ve been seeking assistance on this since August. 

The most common reason is that SparkInterpreter is unable to launch Spark and 
open a Spark Backend.  This is necessary to test the PR.

60+ hours, has been spent adapting and re-basing when the SparkInterpreter 
architecture changed and broke the PR’s *tests.* 


I'm sorry, but the other PRs are passing CI. If it's problem on Zeppelin core, 
why do you think other PRs are passing CI?

And let's say Zeppelin core has problem and that makes your PR fails on CI 
test. That's possible. But it still does not mean we can merge it with CI 
failing.

If you think it's problem on Zeppelin core, then file an issue that reproduce 
the problem on Zeppelin core, that might be more efficient than keep trying 
yourself.


2. Not 100% sure this PR has no license issue. (about KniteR) 
What license problem *specifically* do you believe may exist?  

In preparing the PR, I:  

* Reviewed the Apache policy in detail. 

* Contacted the FSF to ask their interpretation of the “linking” provisions of 
the GPL license.

* Reviewed how other Apache software deals with this issue (e.g., Spark talks 
to R, which is GPL'd). 

* No necessary *dependencies* of the PR have license conflicts.  In several 
cases, I contacted package authors who agreed to re-issue their packages under 
Apache-compatible licenses.  (Usually I had to do a bit of coding in exchange…)

* Where the license had to stay GPL, the packages are *not necessary* and *not 
dependencies.*  If the optional packages are present, they will be used to 
enable additional functionality.  Knitr is an example.  The PR will compile and 
run fine without knitr.  If knitr is available (it is part of the most common R 
distribution), the PR will enable the knitr interpreter.  

* This is exactly how this issue is addressed through the Apache ecosystem. 
The tl;dr is this:   When Apache code is written to talk to libraries that may 
or may not be present on the user’s system, or where it talks to an API but is 
agnostic about implementation, that is not “linking” in a way that implicate 
the anti-linking provision of the GPL.   

Otherwise, no Apache code could ever call a shell script!  Let alone run on 
Linux, or talk to R. 


I'm not a legal expert. So following could be wrong.

In my interpretation, KnitRInterpreter is not an optional feature while it is 
always enabled when compiling Zeppelin and always enabled when running 
Zeppelin. And it requires dynamically linked GPL library on runtime. (yes it 
will fail when no KnitR is installed on the system)

And of course, no Apache code can ever call a shell script, on the purpose of 
dynamic linking with GPL library.

I was guessing SparkR can be still in Apache License even if it is depends on 
R. Because of GPL licensed compiler generated output is not GPL license. and R 
is sort of compiler.

If you can get answer from Spark community how SparkR get managed to stay in 
Apache License while R is GPL, the answer might help.


3. Need more time to review. 
Has any reviewer has downloaded the PR or run the demo notebook?  (Which is 
there for the benefit of reviewers, and isn’t intended to go in final 
distribution.)  

How many +1 comments from users would you like to see? 

How much time do you believe is required?


It all depends on when CI is going to pass, when license problem is going to be 
cleared, and when a committer willing to review and responsible to commit your 
contribution.


1. Large code base change 
Large code base changes require coordination and cooperation.  This PR 
necessarily implicates the build scripts, testing code, the SparkInterpreter, 
etc.  

I have been seeking to coordinate since August. 

I continue to stand ready to do so. 

-Amos   


If i give you one suggestion, Zeppelin committers sometimes ask rebase the 
contribution branch for some reason. It is not the really the best practice, 
but still okay while most contributions are not including large code base 
changes.

However, your one, has more than 4000 lines of code change. If you rebase  then 
review should be started from the beginning, again. So you might want to 
minimize the rebase your branch.

Thanks,
moon

 
From: moon soo Lee <[email protected]>
Reply: [email protected] <[email protected]>
Date: December 1, 2015 at 1:34:19 AM
To: [email protected] <[email protected]>
Subject:  Re: contributions impasse. Was: [GitHub] incubator-zeppelin pull 
request: R Interpreter for Zeppelin

Hi Cos,

Thanks for opening a discussion.
My answer about 'Why this PR is open for three months' is

1. This PR does not passes CI
2. Not 100% sure this PR has no license issue. (about KniteR)
3. Need more time to review.

It's my personal answer, other committers may have different opinion.


I think the question should be generalized. Because this PR is not the only
PR that is in impasse. There're more. For example

Here's some examples that PR is not been merged.
https://github.com/apache/incubator-zeppelin/pull/53,
https://github.com/apache/incubator-zeppelin/pull/60
and so on.

I can categorize the cases, based on experience of involving Zeppelin
community from the beginning.

1. Large code base change

When contribution has large code base changes, it tend to take more time to
review and merged. Normally, most contributions merged in 1day~1 week.
But some contribution has large code base changes take 2~4 weeks. Few
contribution that has very large code base change take months.

2. Communication lost

Sometimes, committer is not responding, or contributor is not responding.

3. Opinion diverges

For some changes, included in contribution. When people have different
opinion and it continue to diverges, those PRs are not been merged.


I think having a guide such as ping other committer when a committer is not
responding, and divide contribution into small peaces if possible, would
help most of the cases.
Of course committer have to pay attention more to the contribution and
helping people. That's the first one.

Thanks,
moon

On Tue, Dec 1, 2015 at 1:54 PM Konstantin Boudnik <[email protected]> wrote:

> To make sure we're on the same page, here are two threads that I found
> related
> to this topic.
>
> Thread 1:
> Subject: R?
> Started on: July 1, 2015
>
> Thread 2:
> Subject: [GitHub] incubator-zeppelin pull request: R Interpreter for
> Zeppelin
> Started on: August 13, 2015
>
> If you want to fetch these from our archive send emails to
> [email protected]
> [email protected]
>
> Cos
>
> On Mon, Nov 30, 2015 at 06:27PM, Konstantin Boudnik wrote:
> > Guys,
> >
> > While catching up on my emails from the last a couple of weeks, this
> thread
> > caught my attention. I am not normally paying much attention to the code
> > reviews traffic from GH, but this one is pretty different as it spans
> three
> > months and counting.
> >
> > First, here are my five cents:
> > - r/R/rzeppelin/LICENSE is wrong: if the code is aimed to be
> contributed to
> > an ASF project this file should simply contain ASL2 text, like in [1]
> > - r/pom.xml perhaps shouldn't contain a separate <developers> section,
> but
> > Zeppelin might have different guidelines on it. Say, Bigtop doesn't
> > maintain this in the source code, but we have the list of all the
> > committers on the project's site [2] Every new committer's first
> commit is
> > to update the team page ;)
> > - comments like in
> r/src/main/java/org/apache/zeppelin/rinterpreter/KnitR.java
> >
> > +/**
> > + * Created by aelberg on 7/28/15.
> > + */
> >
> > is better to be removed. It has been already discussed here [3]. And
> the
> > initial discussion on the topic [4] was linked as well
> > - same goes to r/R/rzeppelin/DESCRIPTION. I am not sure if this is
> R-specific
> > stuff - I have no idea about R, honestly, but
> >
> > +License: GPL (>= 2) | BSD_3_clause + file LICENSE
> > ...
> > +Author: David B. Dahl
> >
> > shouldn't be here, IMO. Normally, if some additional licenses are
> used,
> > they have to be listed in the top-level NOTICE file (already there).
> >
> > - I am not going to offer any comments on the technical merits of the
> patch,
> > as I haven't tried it personally. However, I don't see any serious
> > technical objections to the functionality in question.
> >
> > So, the question is - why the PR is open for three months? I hasn't been
> able
> > to get a clear answer. What I found out though is pretty unsettling,
> really.
> > The communication on the topic is almost non-existent, except for this
> sparse
> > and bitter thread in the GH.
> >
> > I would love to hear from the committers what's stopping the acceptance
> of the
> > code, besides of the minor issues I've mentioned earlier? What are the
> reasons for it?
> > Is there anything wrong with it?
> > One of the responsibilities of the committers is to make sure the
> > contributions are reviewed; new contributors are guided and do
> understand how
> > the project ticks. The easy feedback flow attracts new people, allowing
> the
> > community to grow and thrive.
> >
> > I am asking _explicitely_ not to re-start the bickering I have already
> > seen. At this point I am interested in the purely technical side of this.
> >
> > [1] https://github.com/apache/bigtop/blob/master/LICENSE
> > [2] http://bigtop.apache.org/team-list.html
> > [3]
> http://apache-nifi-developer-list.39713.n7.nabble.com/author-tags-td1335.html
> > [4] http://s.apache.org/iZl
> >
> > With regards,
> > Cos
> >
> > On Mon, Nov 16, 2015 at 11:06PM, elbamos wrote:
> > > Github user elbamos commented on the pull request:
> > >
> > >
> https://github.com/apache/incubator-zeppelin/pull/208#issuecomment-157203411
> > >
> > > The current push should resolve some issues with changes in the
> > > Spark-Zeppelin interface that had created issues for users, as
> well as
> > > support for 1.5.1.
> > >
> > > Knitr support is improved, and the reason for a separate knitr
> interpreter may be clearer now.
> > >
> > > The amount of code borrowed from rscala is reduced.
> > >
> > > I did not address issues with @author tags, or files under the R/
> > > folder. The reason is, to be blunt, I don't understand what the
> precise
> > > concerns actually are.
> > >
> > > Please note that I changed .travis.yml to only use spark 1.4 and
> later.
> > > I'm sure there is a better way to do it, but I'm just not in a
> position
> > > to try to figure out the entire Zeppelin build process.
> > >
> > > Integrating this with the rest of zeppelin is going to take some
> work
> > > regarding pom's, travis, and so forth. I can do a lot of that,
> but I'm
> > > going to need to discuss it with the people who have been "owning"
> those
> > > files. There are just too many moving pieces here.
> > >
> > > Because of the size of this (which is, unfortunately, necessary),
> > > posting here is probably not the most efficient way. That is also
> true
> > > because certain people chose to use this PR as a forum to air other
> > > issues. Therefore, it would be better if reviewers e-mail me
> directly.
>
>
>

Reply via email to