I have just gone through the previous discussions in Jira (HADOOP-13363) and in this thread. As Stack and Duo Zhang mentioned, this artifact (could we name it "shaded" instead of "thirdparty"?) will be voted on by the PMC, as in the thread below. Wouldn't that be fair?
https://lists.apache.org/thread.html/f12c589baabbc79c7fb2843422d4590bea982cd102e2bd9d21e9884b@%3Cdev.hbase.apache.org%3E

One thought here: maybe we can unify this effort (could we start an incubator project for it?), so that all projects can use the same git repo for shaded artifacts. I would like to join the discussion, so please let me know.

On Sun, 5 Jan 2020 at 7:33 AM, Sree Vaddi <sree_at_ch...@yahoo.com.invalid> wrote:

> apache/hadoop-thirdparty: how would it fit into the ASF? As an Incubating Project? Or as a TLP?
> Or as a new project definition?
>
> The effort to streamline and put in place an accepted standard for the dependencies that require shading seems beyond the siloed efforts of Hadoop, HBase, etc.
>
> I propose we bring all the decision makers from all these artifacts into one room and decide the best course of action. As I see it, no project should ever have to shade any artifacts except as an absolutely necessary alternative.
>
> Thank you.
> /Sree
>
> On Saturday, January 4, 2020, 7:49:18 AM PST, Vinayakumar B <vinayakum...@apache.org> wrote:
>
> Hi,
> Sorry for the late reply.
>
> >>> To be exact, how can we better use the thirdparty repo? Looking at HBase as an example, it looks like everything that is known to break a lot after an update gets shaded into the hbase-thirdparty artifact: guava, netty, etc.
> >>> Is it the purpose to isolate these naughty dependencies?
>
> Yes, shading is meant to isolate these naughty dependencies from the downstream classpath and to give us independent control over these upgrades without breaking downstreams.
>
> The first PR, https://github.com/apache/hadoop-thirdparty/pull/1, which creates the protobuf shaded jar, is ready to merge.
>
> Please take a look if you are interested; it will be merged, maybe after two days, if there are no objections.
>
> -Vinay
>
> On Thu, Oct 10, 2019 at 3:30 AM Wei-Chiu Chuang <weic...@apache.org> wrote:
>
> > Hi, I am late to this, but I am keen to understand more.
> >
> > To be exact, how can we better use the thirdparty repo? Looking at HBase as an example, it looks like everything that is known to break a lot after an update gets shaded into the hbase-thirdparty artifact: guava, netty, etc.
> > Is it the purpose to isolate these naughty dependencies?
> >
> > On Wed, Oct 9, 2019 at 12:38 PM Vinayakumar B <vinayakum...@apache.org> wrote:
> >
> >> Hi All,
> >>
> >> I have updated the PR as per @Owen O'Malley <owen.omal...@gmail.com>'s suggestions.
> >>
> >> i. Renamed the module to 'hadoop-shaded-protobuf37'
> >> ii. Kept the shaded package as 'o.a.h.thirdparty.protobuf37'
> >>
> >> Please review!
> >>
> >> Thanks,
> >> -Vinay
> >>
> >> On Sat, Sep 28, 2019 at 10:29 AM 张铎(Duo Zhang) <palomino...@gmail.com> wrote:
> >>
> >> > For HBase we have a separate repo for hbase-thirdparty:
> >> >
> >> > https://github.com/apache/hbase-thirdparty
> >> >
> >> > We publish the artifacts to nexus, so we do not need to include binaries in our git repo; we just add a dependency in the pom.
> >> >
> >> > https://mvnrepository.com/artifact/org.apache.hbase.thirdparty/hbase-shaded-protobuf
> >> >
> >> > It has its own release cycle; we release only when there are special requirements or when we want to upgrade some of the dependencies. This is the vote thread for the newest release, where we want to provide a shaded gson for jdk7.
> >> >
> >> > https://lists.apache.org/thread.html/f12c589baabbc79c7fb2843422d4590bea982cd102e2bd9d21e9884b@%3Cdev.hbase.apache.org%3E
> >> >
> >> > Thanks.
> >> >
> >> > Vinayakumar B <vinayakum...@apache.org> wrote on Sat, Sep 28, 2019 at 1:28 AM:
> >> >
> >> > > Please find replies inline.
> >> > >
> >> > > -Vinay
> >> > >
> >> > > On Fri, Sep 27, 2019 at 10:21 PM Owen O'Malley <owen.omal...@gmail.com> wrote:
> >> > >
> >> > > > I'm very unhappy with this direction.
> >> > > > In particular, I don't think git is a good place for distribution of binary artifacts. Furthermore, the PMC shouldn't be releasing anything without a release vote.
> >> > >
> >> > > The proposed solution doesn't release any binaries in git. It's actually a complete sub-project which follows the entire release process, including a public VOTE. I have already mentioned that the release process is similar to Hadoop's.
> >> > > To be specific, it uses (almost) the same script used in Hadoop to generate artifacts, sign them, and deploy them to the staging repository. Please let me know if I am conveying anything wrong.
> >> > >
> >> > > > I'd propose that we make a third party module that contains the *source* of the pom files to build the relocated jars. This should absolutely be treated as a last resort for the mostly Google projects that regularly break binary compatibility (e.g. Protobuf & Guava).
> >> > >
> >> > > The same has been implemented in the PR https://github.com/apache/hadoop-thirdparty/pull/1. Please check and let me know if I misunderstood. Yes, this is the last option we have, AFAIK.
> >> > >
> >> > > > In terms of naming, I'd propose something like:
> >> > > >
> >> > > > org.apache.hadoop.thirdparty.protobuf2_5
> >> > > > org.apache.hadoop.thirdparty.guava28
> >> > > >
> >> > > > In particular, I think we absolutely need to include the version of the underlying project. On the other hand, since we should not be shading *everything* we can drop the leading com.google.
> >> > >
> >> > > IMO, this naming convention makes it easy to identify the underlying project, but it will be difficult to maintain going forward if the underlying project versions change.
> >> > > Since the thirdparty module has its own releases, each of those releases can be mapped to a specific version of the underlying project. Even the binary artifact can include a MANIFEST with the underlying project details, as per Steve's suggestion on HADOOP-13363.
> >> > > That said, if you still prefer to have the underlying project's version number in the artifact id, it can be done.
> >> > >
> >> > > > The Hadoop project can make releases of the thirdparty module:
> >> > > >
> >> > > > <dependency>
> >> > > >   <groupId>org.apache.hadoop</groupId>
> >> > > >   <artifactId>hadoop-thirdparty-protobuf25</artifactId>
> >> > > >   <version>1.0</version>
> >> > > > </dependency>
> >> > > >
> >> > > > Note that the version has to be the hadoop thirdparty release number, which is part of why you need to have the underlying version in the artifact name. These we can push to maven central as new releases from Hadoop.
> >> > >
> >> > > Exactly, the same has been implemented in the PR. The hadoop-thirdparty module has its own releases. But in the HADOOP Jira, thirdparty versions can be differentiated using the prefix "thirdparty-".
> >> > >
> >> > > The same solution is being followed in HBase. Maybe people involved in HBase can add some points here.
> >> > >
> >> > > > Thoughts?
> >> > > >
> >> > > > .. Owen
> >> > > >
> >> > > > On Fri, Sep 27, 2019 at 8:38 AM Vinayakumar B <vinayakum...@apache.org> wrote:
> >> > > >
> >> > > >> Hi All,
> >> > > >>
> >> > > >> I wanted to discuss the separate repo for thirdparty dependencies which we need to shade and include in Hadoop components' jars.
> >> > > >>
> >> > > >> Apologies for the big text ahead, but this needs a clear explanation!
> >> > > >>
> >> > > >> Right now the most needed such dependency is protobuf.
> >> > > >> The protobuf dependency has not been upgraded beyond 2.5.0 for fear that downstream builds, which depend on the transitive protobuf dependency coming from Hadoop's jars, may fail with the upgrade. Apparently protobuf does not guarantee source compatibility, though it guarantees wire compatibility between versions. Because of this behavior, a version upgrade may cause breakage in known and unknown (private?) downstreams.
> >> > > >>
> >> > > >> So to tackle this, we came up with the following proposal in HADOOP-13363.
> >> > > >>
> >> > > >> Luckily, as far as I know, no APIs, either public to users or between Hadoop processes, directly use protobuf classes in their signatures. (If any exist, please let us know.)
> >> > > >>
> >> > > >> Proposal:
> >> > > >> ------------
> >> > > >>
> >> > > >> 1. Create artifact(s) which contain shaded dependencies. All such shading/relocation will use the known prefix **org.apache.hadoop.thirdparty.**.
> >> > > >> 2. To start with, there will be a protobuf jar (ex: o.a.h.thirdparty:hadoop-shaded-protobuf) in which all **com.google.protobuf** classes are relocated to **org.apache.hadoop.thirdparty.com.google.protobuf**.
> >> > > >> 3. Hadoop modules which need protobuf as a dependency will add this shaded artifact as a dependency (ex: o.a.h.thirdparty:hadoop-shaded-protobuf).
> >> > > >> 4. All previous usages of "com.google.protobuf" will be changed to "org.apache.hadoop.thirdparty.com.google.protobuf" in the code and committed. Please note, this replacement is a one-time change made directly in the source code, NOT during compile and package.
> >> > > >> 5. Once all usages of "com.google.protobuf" are relocated, Hadoop no longer cares which version of the original "protobuf-java" is in the dependency tree.
> >> > > >> 6. Just keep "protobuf-java:2.5.0" in the dependency tree so as not to break the downstreams. But Hadoop itself will actually be using the latest protobuf present in "o.a.h.thirdparty:hadoop-shaded-protobuf".
> >> > > >>
> >> > > >> 7. Coming back to the separate repo, the following are the main reasons for keeping the shaded dependency artifact in a separate repo instead of a submodule.
> >> > > >>
> >> > > >> 7a. These artifacts need not be built all the time. They need to be built only when there is a change in the dependency version or the build process.
> >> > > >> 7b. If added as a "submodule in Hadoop repo", maven-shade-plugin:shade will execute only in the package phase. That means "mvn compile" or "mvn test-compile" will fail, because at that point this artifact will not have the relocated classes; it will have the original classes instead, resulting in compilation failure. The workaround is to build the thirdparty submodule first and exclude the "thirdparty" submodule in other executions. This will be a complex process compared to keeping it in a separate repo.
> >> > > >>
> >> > > >> 7c. The separate repo will be a subproject of Hadoop, using the same HADOOP jira project, with different versioning prefixed with "thirdparty-" (ex: thirdparty-1.0.0).
> >> > > >> 7d. The separate repo will have the same release process as Hadoop.
> >> > > >>
> >> > > >> HADOOP-13363 (https://issues.apache.org/jira/browse/HADOOP-13363) is an umbrella jira tracking the changes for the protobuf upgrade.
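As a rough illustration of steps 1 and 2 of the proposal quoted above, the relocation could be expressed with a maven-shade-plugin configuration along the following lines. This is only a sketch: the `relocation`, `pattern` and `shadedPattern` elements are standard Maven Shade Plugin configuration, but the surrounding layout is an assumption and is not copied from the hadoop-thirdparty PR. The plugin version is deliberately left out and assumed to be managed elsewhere.

```xml
<!-- Sketch only: assumed build section of a shaded-protobuf module, not taken from the PR. -->
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <executions>
        <execution>
          <!-- shade is bound to the package phase, which is the constraint described in 7b -->
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <relocations>
              <relocation>
                <!-- original package of the bundled protobuf classes -->
                <pattern>com.google.protobuf</pattern>
                <!-- relocated package under the agreed prefix from step 1 -->
                <shadedPattern>org.apache.hadoop.thirdparty.com.google.protobuf</shadedPattern>
              </relocation>
            </relocations>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

With a relocation like this in the published artifact, Hadoop source files would import `org.apache.hadoop.thirdparty.com.google.protobuf.*` instead of `com.google.protobuf.*`, which is the one-time source change described in step 4.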
> >> > > >>
> >> > > >> A PR (https://github.com/apache/hadoop-thirdparty/pull/1) has been raised for the separate repo creation in HADOOP-16595 (https://issues.apache.org/jira/browse/HADOOP-16595).
> >> > > >>
> >> > > >> Please provide your inputs on the proposal and review the PR so that we can proceed.
> >> > > >>
> >> > > >> -Thanks,
> >> > > >> Vinay
> >> > > >>
> >> > > >> On Fri, Sep 27, 2019 at 11:54 AM Vinod Kumar Vavilapalli <vino...@apache.org> wrote:
> >> > > >>
> >> > > >> > Moving the thread to the dev lists.
> >> > > >> >
> >> > > >> > Thanks
> >> > > >> > +Vinod
> >> > > >> >
> >> > > >> > > On Sep 23, 2019, at 11:43 PM, Vinayakumar B <vinayakum...@apache.org> wrote:
> >> > > >> > >
> >> > > >> > > Thanks Marton,
> >> > > >> > >
> >> > > >> > > The newly created 'hadoop-thirdparty' repo is empty right now.
> >> > > >> > > Whether to use that repo for the shaded artifact or not will be tracked in the HADOOP-13363 umbrella jira. Please feel free to join the discussion.
> >> > > >> > >
> >> > > >> > > No existing codebase is being moved out of the hadoop repo, so I think we are good to go right now.
> >> > > >> > >
> >> > > >> > > -Vinay
> >> > > >> > >
> >> > > >> > > On Mon, Sep 23, 2019 at 11:38 PM Marton Elek <e...@apache.org> wrote:
> >> > > >> > >
> >> > > >> > >> I am not sure if it's defined when a vote is required.
> >> > > >> > >>
> >> > > >> > >> https://www.apache.org/foundation/voting.html
> >> > > >> > >>
> >> > > >> > >> Personally I think it's a big enough change to send a notification to the dev lists with a 'lazy consensus' closure.
> >> > > >> > >>
> >> > > >> > >> Marton
> >> > > >> > >>
> >> > > >> > >> On 2019/09/23 17:46:37, Vinayakumar B <vinayakum...@apache.org> wrote:
> >> > > >> > >>> Hi,
> >> > > >> > >>>
> >> > > >> > >>> As discussed in HADOOP-13363, the protobuf 3.x jar (and maybe more in future) will be kept as a shaded artifact in a separate repo, which will be referenced as a dependency in hadoop modules. This approach avoids shading every submodule during the build.
> >> > > >> > >>>
> >> > > >> > >>> So the question is: is any VOTE required before asking to create a git repo?
> >> > > >> > >>>
> >> > > >> > >>> On the self-serve platform https://gitbox.apache.org/setup/newrepo.html I can see that the requester should be a PMC member.
> >> > > >> > >>>
> >> > > >> > >>> Wanted to confirm here first.
> >> > > >> > >>>
> >> > > >> > >>> -Vinay

--
--Brahma Reddy Battula