Hey all,

We already have nightly builds for Hive [1].

Do we need something more than that?

Best,
Stamatis

[1] http://ci.hive.apache.org/job/hive-nightly/


On Tue, May 23, 2023 at 9:03 AM vihang karajgaonkar <vihan...@apache.org> wrote:
>
> I think there are many benefits, as others in this thread suggested, that
> can be built on top of nightly builds. Having Docker images is great, but
> for now I think we can start simple and publish the jars. Many users still
> just deploy using jars and it would be useful to them. Once we have a
> Docker environment we can also add a Docker image to the nightly builds so
> that users can choose their preferred way.
>
> On Mon, May 22, 2023 at 11:07 PM Sungwoo Park <glap...@gmail.com> wrote:
>
> > I think such nightly builds will be useful for testing and debugging in the
> > future.
> >
> > I also wonder if we can somehow create builds even from previous commits
> > (e.g., for the past few years). Such builds from previous commits don't
> > have to be daily builds, and I think weekly builds (or even monthly builds)
> > would also be very useful.
> >
> > The reason I wish such builds were available is to facilitate debugging and
> > testing. When tested against the TPC-DS benchmark, the current master
> > branch has several correctness problems that were introduced after the
> > release of Hive 3.1.2. We have reported all problems known to us in [1] and
> > also submitted several patches. If such nightly builds had been available,
> > we would have saved quite a bit of time implementing the patches by
> > quickly finding the offending commits that introduced new correctness bugs.
> >
> > In addition, you can find quite a few commits in the master branch that
> > report bugs which are not reproduced in Hive 3.1.2. Examples: HIVE-19990,
> > HIVE-14557, HIVE-21132, HIVE-21188, HIVE-21544, HIVE-22114,
> > HIVE-22227, HIVE-22236, HIVE-23911, HIVE-24198, HIVE-22777,
> > HIVE-25170, HIVE-25864, HIVE-26671.
> > (There may be some errors in this list because we compared against Hive
> > 3.1.2 with many patches backported.) Such nightly builds can be useful for
> > finding root causes of such bugs.
> >
> > Ideally I wish there was an automated procedure to create nightly builds,
> > run TPC-DS benchmark, and report correctness/performance results, although
> > this would be quite hard to implement. (I remember Spark implemented this
> > procedure in the era of Spark 2, but my memory could be wrong.)
> >
> > [1] https://issues.apache.org/jira/browse/HIVE-26654
> >
> >
> > On Tue, May 23, 2023 at 10:44 AM Ayush Saxena <ayush...@gmail.com> wrote:
> >
> > > Hi Vihang,
> > > +1. We were also exploring publishing Docker images of the snapshot
> > > version, per commit or maybe weekly, so that you just run two docker
> > > commands and get a Hive cluster running the master code.
> > >
> > > Sai, I think spinning up an env via Docker with all these things should be
> > > doable for sure, but it would require someone with really good expertise in
> > > Docker as well as in setting up these services with Hive. Obviously, I am
> > > not that guy :-)
> > >
> > > @Simhadri has a PR which publishes Docker images once a release tag is
> > > pushed. You could explore doing something similar for the snapshot version,
> > > if that sounds cool.
> > >
> > > -Ayush
> > >
> > > On Tue, 23 May 2023 at 04:26, Sai Hemanth Gantasala
> > > <saihema...@cloudera.com.invalid> wrote:
> > >
> > > > Hi Vihang,
> > > >
> > > > +1 on the idea.
> > > >
> > > > This is a great way to quickly test whether a certain feature is working as
> > > > expected on a certain branch.
> > > > This way we can test for data loss, correctness, or any other unexpected
> > > > scenarios that are Hive-specific. However, I'm wondering if it is possible
> > > > to deploy/test in a kerberized environment, or to test issues involving
> > > > authorization services like Sentry/Ranger.
> > > >
> > > > Thanks,
> > > > Sai.
> > > >
> > > > On Mon, May 22, 2023 at 11:15 AM vihang karajgaonkar <vihan...@apache.org> wrote:
> > > >
> > > > > Hello Team,
> > > > >
> > > > > I have observed that it is a common use case for users to want to test
> > > > > out unreleased features/bug fixes, either to unblock themselves or to check
> > > > > whether the bug fixes really work as intended in their environments. Today,
> > > > > in the case of Apache Hive, this is not very user-friendly because it
> > > > > requires the end user to build the binaries directly from the Hive source
> > > > > code.
> > > > >
> > > > > I found that Apache Spark has a very useful infrastructure [1] which
> > > > > deploys nightly snapshots [2] [3] from the branch using GitHub Actions.
> > > > > This is super useful for any user who wants to try out the latest and
> > > > > greatest using the nightly builds.
> > > > >
> > > > > I was wondering if we should also adopt this. We can use GitHub Actions
> > > > > to upload the snapshot jars to a public repository (e.g., GitHub Packages)
> > > > > and schedule it as a nightly job.
> > > > >
> > > > > [1] https://issues.apache.org/jira/browse/INFRA-21167
> > > > > [2] https://github.com/apache/spark/pkgs/container/apache-spark-ci-image
> > > > > [3] https://github.com/apache/spark/pull/30623
> > > > >
> > > > > I can take a stab at this if the community thinks that this is a nice
> > > > > thing to have.
> > > > >
> > > > > Thanks,
> > > > > Vihang
> > > > >
> > > >
> > >
> >
