Hi All,
FYI:
Since no one has raised objections, we will go ahead with the present approach and merge by tomorrow EOD.
Thanks, everyone!

-Ayush

> On 07-Jan-2020, at 9:22 PM, Brahma Reddy Battula <bra...@apache.org> wrote:
> 
> Hi Sree Vaddi, Owen, Stack, Duo Zhang,
> 
> We can move forward based on your comments; we are just waiting for your
> reply. I hope all of your comments have been answered. (As Vinay mentioned,
> we can pursue unification in a parallel thread.)
> 
> 
> 
> On Mon, 6 Jan 2020 at 6:21 PM, Vinayakumar B <vinayakum...@apache.org>
> wrote:
> 
>> Hi Sree,
>> 
>>> apache/hadoop-thirdparty, How would it fit into ASF ? As an Incubating
>> Project ? Or as a TLP ?
>>> Or as a new project definition ?
>> As already mentioned by Ayush, this will be a subproject of Hadoop.
>> Releases will be voted by Hadoop PMC as per ASF process.
>> 
>> 
>>> The effort to streamline and put in place an accepted standard for the
>>> dependencies that require shading seems beyond the siloed efforts of
>>> hadoop, hbase, etc.
>> 
>>> I propose we bring all the decision makers from all these artifacts into
>>> one room and decide the best course of action.
>>> I am looking at it as: no project should ever have to shade any artifacts
>>> except as an absolutely necessary alternative.
>> 
>> This is the ideal proposal for any project, but unfortunately some
>> projects take their own course based on need.
>> 
>> In the current case of protobuf in Hadoop:
>>    The protobuf upgrade from 2.5.0 (which is already EOL) was not taken
>> up, to avoid downstream failures. Since Hadoop is a platform, its
>> dependencies get added to downstream projects' classpath, so any change in
>> Hadoop's dependencies will directly affect downstreams. Hadoop follows
>> backward compatibility as strictly as possible.
>>    Though protobuf provides wire compatibility between versions, it
>> doesn't provide compatibility for generated sources.
>>    Now, to support ARM, a protobuf upgrade is mandatory. Using the shading
>> technique, Hadoop can internally upgrade to a shaded protobuf 3.x and
>> still expose protobuf 2.5.0 (deprecated) to downstreams.
>> 
>> This shading is necessary to support both versions of protobuf: 2.5.0
>> (non-shaded) for the downstream classpath and 3.x (shaded) for Hadoop's
>> internal usage.
>> And this entire work needs to be done before the 3.3.0 release.
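>> As an illustration of the approach (not the actual pom from the PR), the
>> relocation in a hadoop-shaded-protobuf module could be sketched with
>> maven-shade-plugin roughly like this:

```xml
<!-- Hypothetical sketch only; see the actual hadoop-thirdparty PR for the
     real build. Relocates all protobuf classes under the thirdparty prefix
     while packaging the shaded jar. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>com.google.protobuf</pattern>
            <shadedPattern>org.apache.hadoop.thirdparty.com.google.protobuf</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

>> With a relocation like this, downstream classpaths keep seeing the
>> unshaded protobuf-java, while Hadoop's internal code references the
>> relocated package.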
>> 
>> So, though it's ideal to have a common approach for all projects, I
>> suggest that for Hadoop we go ahead with the current approach.
>> We can also start a parallel effort to address these problems in a
>> separate discussion/proposal. Once a solution is available we can revisit
>> and adopt the new solution accordingly in all such projects (ex: HBase,
>> Hadoop, Ratis).
>> 
>> -Vinay
>> 
>>> On Mon, Jan 6, 2020 at 12:39 AM Ayush Saxena <ayush...@gmail.com> wrote:
>>> 
>>> Hey Sree
>>> 
>>>> apache/hadoop-thirdparty, How would it fit into ASF ? As an Incubating
>>>> Project ? Or as a TLP ?
>>>> Or as a new project definition ?
>>>> 
>>> A sub-project of Apache Hadoop, having its own independent release
>>> cycles. Maybe you can put it in the same column as Ozone, or as
>>> Submarine (a couple of months ago).
>>> 
>>> Unifying for all seems interesting, but each project is independent and
>>> has its own limitations and way of thinking. I don't think it would be an
>>> easy task to bring everyone to the same table and get them to agree on a
>>> common approach.
>>> 
>>> I guess this has been under discussion for quite a long time, and no
>>> other alternative has been suggested. Still, we can hold off for a week;
>>> if someone comes up with a better solution we can take it, else we can
>>> continue in the present direction.
>>> 
>>> -Ayush
>>> 
>>> 
>>> 
>>> On Sun, 5 Jan 2020 at 05:03, Sree Vaddi <sree_at_ch...@yahoo.com
>> .invalid>
>>> wrote:
>>> 
>>>> apache/hadoop-thirdparty, How would it fit into ASF ? As an Incubating
>>>> Project ? Or as a TLP ?
>>>> Or as a new project definition ?
>>>> 
>>>> The effort to streamline and put in place an accepted standard for the
>>>> dependencies that require shading seems beyond the siloed efforts of
>>>> hadoop, hbase, etc.
>>>> 
>>>> I propose we bring all the decision makers from all these artifacts into
>>>> one room and decide the best course of action. I am looking at it as: no
>>>> project should ever have to shade any artifacts except as an absolutely
>>>> necessary alternative.
>>>> 
>>>> 
>>>> Thank you.
>>>> /Sree
>>>> 
>>>> 
>>>> 
>>>>    On Saturday, January 4, 2020, 7:49:18 AM PST, Vinayakumar B <
>>>> vinayakum...@apache.org> wrote:
>>>> 
>>>> Hi,
>>>> Sorry for the late reply.
>>>>>>> To be exact, how can we better use the thirdparty repo? Looking at
>>>>>>> HBase as an example, it looks like everything that is known to break
>>>>>>> a lot after an update gets shaded into the hbase-thirdparty artifact:
>>>>>>> guava, netty, etc.
>>>>>>> Is it the purpose to isolate these naughty dependencies?
>>>> Yes, the purpose of shading is to isolate these naughty dependencies
>>>> from the downstream classpath and to have independent control over their
>>>> upgrades without breaking downstreams.
>>>> 
>>>> The first PR, https://github.com/apache/hadoop-thirdparty/pull/1, to
>>>> create the shaded protobuf jar is ready to merge.
>>>> 
>>>> Please take a look if anyone is interested; it will be merged after
>>>> about two days if there are no objections.
>>>> 
>>>> -Vinay
>>>> 
>>>> 
>>>> On Thu, Oct 10, 2019 at 3:30 AM Wei-Chiu Chuang <weic...@apache.org>
>>>> wrote:
>>>> 
>>>>> Hi, I am late to this, but I am keen to understand more.
>>>>> 
>>>>> To be exact, how can we better use the thirdparty repo? Looking at
>>>>> HBase as an example, it looks like everything that is known to break a
>>>>> lot after an update gets shaded into the hbase-thirdparty artifact:
>>>>> guava, netty, etc.
>>>>> Is it the purpose to isolate these naughty dependencies?
>>>>> 
>>>>> On Wed, Oct 9, 2019 at 12:38 PM Vinayakumar B <
>> vinayakum...@apache.org
>>>> 
>>>>> wrote:
>>>>> 
>>>>>> Hi All,
>>>>>> 
>>>>>> I have updated the PR as per @Owen O'Malley <owen.omal...@gmail.com>'s
>>>>>> suggestions:
>>>>>> 
>>>>>>   i. Renamed the module to 'hadoop-shaded-protobuf37'
>>>>>>   ii. Kept the shaded package as 'o.a.h.thirdparty.protobuf37'
>>>>>> 
>>>>>> Please review!!
>>>>>> 
>>>>>> Thanks,
>>>>>> -Vinay
>>>>>> 
>>>>>> 
>>>>>> On Sat, Sep 28, 2019 at 10:29 AM 张铎(Duo Zhang) <
>> palomino...@gmail.com
>>>> 
>>>>>> wrote:
>>>>>> 
>>>>>>> For HBase we have a separate repo for hbase-thirdparty:
>>>>>>> 
>>>>>>> https://github.com/apache/hbase-thirdparty
>>>>>>> 
>>>>>>> We publish the artifacts to Nexus so we do not need to include
>>>>>>> binaries in our git repo; we just add a dependency in the pom.
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>> 
>>> 
>> https://mvnrepository.com/artifact/org.apache.hbase.thirdparty/hbase-shaded-protobuf
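>>>>>>> For illustration, pulling that shaded artifact in is an ordinary pom
>>>>>>> dependency (the version number below is only an example):

```xml
<!-- Illustrative only: the version here is an example, not a
     recommendation; pick the release you need from Maven Central. -->
<dependency>
  <groupId>org.apache.hbase.thirdparty</groupId>
  <artifactId>hbase-shaded-protobuf</artifactId>
  <version>3.0.0</version>
</dependency>
```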
>>>>>>> 
>>>>>>> 
>>>>>>> And it has its own release cycle; we cut a release only when there
>>>>>>> are special requirements or we want to upgrade some of the
>>>>>>> dependencies. This is the vote thread for the newest release, where
>>>>>>> we want to provide a shaded gson for JDK 7.
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>> 
>>> 
>> https://lists.apache.org/thread.html/f12c589baabbc79c7fb2843422d4590bea982cd102e2bd9d21e9884b@%3Cdev.hbase.apache.org%3E
>>>>>>> 
>>>>>>> 
>>>>>>> Thanks.
>>>>>>> 
>>>>>>> Vinayakumar B <vinayakum...@apache.org> wrote on Sat, Sep 28, 2019 at 1:28 AM:
>>>>>>> 
>>>>>>>> Please find replies inline.
>>>>>>>> 
>>>>>>>> -Vinay
>>>>>>>> 
>>>>>>>> On Fri, Sep 27, 2019 at 10:21 PM Owen O'Malley <
>>>>>> owen.omal...@gmail.com>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> I'm very unhappy with this direction. In particular, I don't
>>> think
>>>>>> git
>>>>>>> is
>>>>>>>>> a good place for distribution of binary artifacts.
>> Furthermore,
>>>> the
>>>>>> PMC
>>>>>>>>> shouldn't be releasing anything without a release vote.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> The proposed solution doesn't release any binaries in git. It's
>>>>>>>> actually a complete sub-project which follows the entire release
>>>>>>>> process, including a public VOTE. I have already mentioned that the
>>>>>>>> release process is similar to Hadoop's.
>>>>>>>> To be specific, it uses (almost) the same script used in Hadoop to
>>>>>>>> generate artifacts, sign them, and deploy to the staging repository.
>>>>>>>> Please let me know if I am conveying anything wrong.
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> I'd propose that we make a third party module that contains
>> the
>>>>>>> *source*
>>>>>>>>> of the pom files to build the relocated jars. This should
>>>>>> absolutely be
>>>>>>>>> treated as a last resort for the mostly Google projects that
>>>>>> regularly
>>>>>>>>> break binary compatibility (eg. Protobuf & Guava).
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> The same has been implemented in the PR
>>>>>>>> https://github.com/apache/hadoop-thirdparty/pull/1. Please check and
>>>>>>>> let me know if I misunderstood. Yes, this is the last option we
>>>>>>>> have, AFAIK.
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> In terms of naming, I'd propose something like:
>>>>>>>>> 
>>>>>>>>> org.apache.hadoop.thirdparty.protobuf2_5
>>>>>>>>> org.apache.hadoop.thirdparty.guava28
>>>>>>>>> 
>>>>>>>>> In particular, I think we absolutely need to include the
>> version
>>>> of
>>>>>> the
>>>>>>>>> underlying project. On the other hand, since we should not be
>>>>>> shading
>>>>>>>>> *everything* we can drop the leading com.google.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> IMO, this naming convention makes it easy to identify the underlying
>>>>>>>> project, but it will be difficult to maintain going forward if the
>>>>>>>> underlying project's version changes. Since the thirdparty module
>>>>>>>> has its own releases, each release can be mapped to a specific
>>>>>>>> version of the underlying project. The binary artifact can even
>>>>>>>> include a MANIFEST with the underlying project's details, as per
>>>>>>>> Steve's suggestion on HADOOP-13363.
>>>>>>>> That said, if you still prefer to have the version number in the
>>>>>>>> artifact id, it can be done.
>>>>>>>> 
>>>>>>>>> The Hadoop project can make releases of the thirdparty module:
>>>>>>>>> 
>>>>>>>>> <dependency>
>>>>>>>>> <groupId>org.apache.hadoop</groupId>
>>>>>>>>> <artifactId>hadoop-thirdparty-protobuf25</artifactId>
>>>>>>>>> <version>1.0</version>
>>>>>>>>> </dependency>
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Note that the version has to be the hadoop-thirdparty release
>>>>>>>>> number, which is part of why you need to have the underlying
>>>>>>>>> version in the artifact name. These we can push to Maven Central as
>>>>>>>>> new releases from Hadoop.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> Exactly; the same has been implemented in the PR. The
>>>>>>>> hadoop-thirdparty module has its own releases, and in the HADOOP
>>>>>>>> Jira, thirdparty versions can be differentiated using the prefix
>>>>>>>> "thirdparty-".
>>>>>>>> 
>>>>>>>> The same solution is being followed in HBase. Maybe people involved
>>>>>>>> in HBase can add some points here.
>>>>>>>> 
>>>>>>>> Thoughts?
>>>>>>>>> 
>>>>>>>>> .. Owen
>>>>>>>>> 
>>>>>>>>> On Fri, Sep 27, 2019 at 8:38 AM Vinayakumar B <
>>>>>> vinayakum...@apache.org
>>>>>>>> 
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> Hi All,
>>>>>>>>>> 
>>>>>>>>>>   I wanted to discuss the separate repo for thirdparty
>>>>>>>>>> dependencies which we need to shade and include in Hadoop
>>>>>>>>>> components' jars.
>>>>>>>>>> 
>>>>>>>>>>   Apologies for the big text ahead, but this needs a clear
>>>>>>>>>> explanation!
>>>>>>>>>> 
>>>>>>>>>>   Right now the most needed such dependency is protobuf. The
>>>>>>>>>> protobuf dependency was not upgraded from 2.5.0 for fear that
>>>>>>>>>> downstream builds, which depend on the transitive protobuf
>>>>>>>>>> dependency coming from Hadoop's jars, may fail with the upgrade.
>>>>>>>>>> Apparently protobuf does not guarantee source compatibility,
>>>>>>>>>> though it guarantees wire compatibility between versions. Because
>>>>>>>>>> of this behavior, a version upgrade may cause breakage in known
>>>>>>>>>> and unknown (private?) downstreams.
>>>>>>>>>> 
>>>>>>>>>>   So to tackle this, we came up with the following proposal in
>>>>>>>>>> HADOOP-13363.
>>>>>>>>>> 
>>>>>>>>>>   Luckily, as far as I know, no API, either public to users or
>>>>>>>>>> between Hadoop processes, directly uses protobuf classes in its
>>>>>>>>>> signature. (If any exist, please let us know.)
>>>>>>>>>> 
>>>>>>>>>>   Proposal:
>>>>>>>>>>   ------------
>>>>>>>>>> 
>>>>>>>>>>   1. Create artifact(s) which contain the shaded dependencies.
>>>>>>>>>> All such shading/relocation will use the known prefix
>>>>>>>>>> **org.apache.hadoop.thirdparty.**.
>>>>>>>>>>   2. Starting with the protobuf jar (ex:
>>>>>>>>>> o.a.h.thirdparty:hadoop-shaded-protobuf), all
>>>>>>>>>> **com.google.protobuf** classes will be relocated as
>>>>>>>>>> **org.apache.hadoop.thirdparty.com.google.protobuf**.
>>>>>>>>>>   3. Hadoop modules which need protobuf as a dependency will add
>>>>>>>>>> this shaded artifact as a dependency (ex:
>>>>>>>>>> o.a.h.thirdparty:hadoop-shaded-protobuf).
>>>>>>>>>>   4. All previous usages of "com.google.protobuf" will be
>>>>>>>>>> relocated to "org.apache.hadoop.thirdparty.com.google.protobuf" in
>>>>>>>>>> the code and committed. Please note, this replacement is one-time,
>>>>>>>>>> directly in the source code, NOT during compile and package.
>>>>>>>>>>   5. Once all usages of "com.google.protobuf" are relocated,
>>>>>>>>>> Hadoop doesn't care which version of the original "protobuf-java"
>>>>>>>>>> is in the dependency tree.
>>>>>>>>>>   6. Just keep "protobuf-java:2.5.0" in the dependency tree so as
>>>>>>>>>> not to break downstreams, while Hadoop itself uses the latest
>>>>>>>>>> protobuf present in "o.a.h.thirdparty:hadoop-shaded-protobuf".
>>>>>>>>>> 
>>>>>>>>>>   7. Coming back to the separate repo, the following are the main
>>>>>>>>>> reasons for keeping the shaded dependency artifact in a separate
>>>>>>>>>> repo instead of a submodule:
>>>>>>>>>> 
>>>>>>>>>>     7a. These artifacts need not be built all the time. They need
>>>>>>>>>> to be built only when there is a change in the dependency version
>>>>>>>>>> or the build process.
>>>>>>>>>>     7b. If added as a "submodule in the Hadoop repo",
>>>>>>>>>> maven-shade-plugin:shade will execute only in the package phase.
>>>>>>>>>> That means "mvn compile" or "mvn test-compile" would fail, as the
>>>>>>>>>> artifact would not yet have the relocated classes; it would have
>>>>>>>>>> the original classes instead, resulting in compilation failure.
>>>>>>>>>> The workaround would be to build the thirdparty submodule first
>>>>>>>>>> and exclude the "thirdparty" submodule in other executions. This
>>>>>>>>>> would be a complex process compared to keeping it in a separate
>>>>>>>>>> repo.
>>>>>>>>>> 
>>>>>>>>>>     7c. The separate repo will be a subproject of Hadoop, using
>>>>>>>>>> the same HADOOP jira project, with different versioning prefixed
>>>>>>>>>> with "thirdparty-" (ex: thirdparty-1.0.0).
>>>>>>>>>>     7d. The separate repo will have the same release process as
>>>>>>>>>> Hadoop.
>>>>>>>>>> 
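>>>>>>>>>>   Step 4 above, the one-time relocation in the source code, can
>>>>>>>>>> be sketched roughly as below (the paths and file are illustrative,
>>>>>>>>>> not the real Hadoop tree):

```shell
# Hypothetical sketch of the one-time relocation (step 4).
# Create a demo source file standing in for a Hadoop module.
mkdir -p demo/src
cat > demo/src/Example.java <<'EOF'
import com.google.protobuf.Message;
EOF

# Rewrite every com.google.protobuf reference to the thirdparty prefix,
# directly in the sources (not at compile/package time).
grep -rl 'com\.google\.protobuf' demo/src | while read -r f; do
  sed -i.bak 's/com\.google\.protobuf/org.apache.hadoop.thirdparty.com.google.protobuf/g' "$f"
done

cat demo/src/Example.java
# -> import org.apache.hadoop.thirdparty.com.google.protobuf.Message;
```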
>>>>>>>>>>   HADOOP-13363
>>>>>>>>>> (https://issues.apache.org/jira/browse/HADOOP-13363) is an
>>>>>>>>>> umbrella jira tracking the protobuf upgrade changes.
>>>>>>>>>> 
>>>>>>>>>>   A PR (https://github.com/apache/hadoop-thirdparty/pull/1) has
>>>>>>>>>> been raised for the separate repo creation in HADOOP-16595
>>>>>>>>>> (https://issues.apache.org/jira/browse/HADOOP-16595).
>>>>>>>>>> 
>>>>>>>>>>   Please provide your inputs on the proposal and review the PR so
>>>>>>>>>> we can proceed.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>   -Thanks,
>>>>>>>>>>   Vinay
>>>>>>>>>> 
>>>>>>>>>> On Fri, Sep 27, 2019 at 11:54 AM Vinod Kumar Vavilapalli <
>>>>>>>>>> vino...@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Moving the thread to the dev lists.
>>>>>>>>>>> 
>>>>>>>>>>> Thanks
>>>>>>>>>>> +Vinod
>>>>>>>>>>> 
>>>>>>>>>>>> On Sep 23, 2019, at 11:43 PM, Vinayakumar B <
>>>>>>>> vinayakum...@apache.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks Marton,
>>>>>>>>>>>> 
>>>>>>>>>>>> The newly created 'hadoop-thirdparty' repo is empty right now.
>>>>>>>>>>>> Whether to use that repo for the shaded artifact or not will be
>>>>>>>>>>>> tracked in the HADOOP-13363 umbrella jira. Please feel free to
>>>>>>>>>>>> join the discussion.
>>>>>>>>>>>> 
>>>>>>>>>>>> No existing codebase is being moved out of the hadoop repo, so I
>>>>>>>>>>>> think right now we are good to go.
>>>>>>>>>>>> 
>>>>>>>>>>>> -Vinay
>>>>>>>>>>>> 
>>>>>>>>>>>> On Mon, Sep 23, 2019 at 11:38 PM Marton Elek <
>>>> e...@apache.org>
>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I am not sure it's defined when a vote is required.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> https://www.apache.org/foundation/voting.html
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Personally, I think it's a big enough change to send a
>>>>>>>>>>>>> notification to the dev lists with a 'lazy consensus' closure.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Marton
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On 2019/09/23 17:46:37, Vinayakumar B <
>>>>>> vinayakum...@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> As discussed in HADOOP-13363, the protobuf 3.x jar (and maybe
>>>>>>>>>>>>>> more in future) will be kept as a shaded artifact in a
>>>>>>>>>>>>>> separate repo, which will be referenced as a dependency in
>>>>>>>>>>>>>> hadoop modules. This approach avoids shading every submodule
>>>>>>>>>>>>>> during the build.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> So the question is: is a VOTE required before asking to
>>>>>>>>>>>>>> create a git repo?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On the self-serve platform
>>>>>>>>>>>>>> https://gitbox.apache.org/setup/newrepo.html I can see that
>>>>>>>>>>>>>> the requester should be a PMC member.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Wanted to confirm here first.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> -Vinay
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>> 
>>>> ---------------------------------------------------------------------
>>>>>>>>>>>>> To unsubscribe, e-mail:
>>>> private-unsubscr...@hadoop.apache.org
>>>>>>>>>>>>> For additional commands, e-mail:
>>>>>> private-h...@hadoop.apache.org
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>> 
>> 
> -- 
> --Brahma Reddy Battula

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
