I'm afraid I don't agree that we're anywhere near coming to a
consensus, or even that we're all agreeing on what we're discussing.
(I do totally agree that the discussion itself has been awesome both
in tone and content, though).

As Tim brought up and I mentioned, the Board is not big on subprojects
right now, for a lot of the reasons that flowed from Chris' points.
What role would the current Samza PMC have in Kafka? What role would
the Kafka PMC have over the Samza code?  Why would some members of the
Samza PMC be rolled into the Kafka PMC, but not others?  These types of
questions are where the whole Community Over Code ethos comes from;
it's better to have a happy community than the absolute, subjective
best bit of code in the repo.  As a member of both communities, I can
say that the Kafka and Samza cultures and communities are significantly
different.  For example, Kafka has very, very strict procedures for
code contributions.  Samza does not.  One might be better than the
other, but again, it's down to community, and asking the Samza
community to integrate the Kafka approach is a bigger issue than
asking for another project to add some code to its repo.

The Board will care about the communities, not the code, and most of
this discussion has focused almost entirely on the code.

Additionally, except for Jay (and myself, but I'm pretty Kafka
inactive), there has been no input from the Kafka community.  Even if
we did have full agreement on the Samza side for Option C ("Hey,
Samza! FYI, Kafka does streaming now!"), the Kafka community has no
need to agree or participate.

Personally, my preference would be for a Samza 2.0 approach.  There
are a lot of lessons learned in the project so far and, with a
willingness to break APIs, we could improve dramatically in terms of
ease of use, supported execution environments and support for other
types of input and output methods.  It may be that the community
splits in this regard, with some contributing to a new streaming
library in Kafka and others contributing to a continuation of the
current Samza approach.  From an ASF perspective, this would be a
perfectly acceptable outcome because, again, the communities would be
quite harmonious.

-Jakob

On 12 July 2015 at 17:54, Chris Riccomini <criccom...@apache.org> wrote:
> Given that Jay, Martin, and I seem to be aligning fairly closely, I think
> we should start with:
>
> 1. [community] Make Samza a subproject of Kafka.
> 2. [community] Make all Samza PMC/committers committers of the subproject.
> 3. [community] Migrate Samza's website/documentation into Kafka's.
> 4. [code] Have the Samza community and the Kafka community start a
> from-scratch reboot together in the new Kafka subproject. We can
> borrow/copy & paste significant chunks of code from Samza's code base.
> 5. [code] The subproject would intentionally eliminate support for both
> other streaming systems and all deployment systems.
> 6. [code] Attempt to provide a bridge from our SystemConsumer to KIP-26
> (copy cat)
> 7. [code] Attempt to provide a bridge from the new subproject's processor
> interface to our legacy StreamTask interface.
> 8. [code/community] Sunset Samza as a TLP when we have a working Kafka
> subproject that has a fault-tolerant container with state management.
>
> It's likely that (6) and (7) won't be fully drop-in. Still, the closer we
> can get, the better it's going to be for our existing community.
>
> One thing that I didn't touch on with (2) is whether any Samza PMC members
> should be rolled into Kafka PMC membership as well (though, Jay and Jakob
> are already PMC members on both). I think that Samza's community deserves a
> voice on the PMC, so I'd propose that we roll at least a few PMC members
> into the Kafka PMC, but I don't have a strong framework for which people to
> pick.
>
> Before (8), I think that Samza's TLP can continue to commit bug fixes and
> patches as it sees fit, provided that we openly communicate that we won't
> necessarily migrate new features to the new subproject, and that the TLP
> will be shut down after the migration to the Kafka subproject occurs.
>
> Jakob, I could use your guidance here about how to achieve this from
> an Apache process perspective (sorry).
>
> * Should I just call a vote on this proposal?
> * Should it happen on dev or private?
> * Do committers have binding votes, or just PMC?
>
> Having trouble finding much detail on the Apache wikis. :(
