Hey all,

We already have nightly builds for Hive [1].
Do we need something more than that?

Best,
Stamatis

[1] http://ci.hive.apache.org/job/hive-nightly/

On Tue, May 23, 2023 at 9:03 AM vihang karajgaonkar <vihan...@apache.org> wrote:
>
> I think there are many benefits like others in this thread suggested which
> can be built on top of nightly builds. Having docker images is great but
> for now I think we can start simple and publish the jars. Many users still
> just deploy using jars and it would be useful to them. Once we have a
> docker environment we can add a docker image too to the nightly builds so
> that users can choose their preferred way.
>
> On Mon, May 22, 2023 at 11:07 PM Sungwoo Park <glap...@gmail.com> wrote:
>
> > I think such nightly builds will be useful for testing and debugging in
> > the future.
> >
> > I also wonder if we can somehow create builds even from previous commits
> > (e.g., for the past few years). Such builds from previous commits don't
> > have to be daily builds, and I think weekly builds (or even monthly
> > builds) would also be very useful.
> >
> > The reason I wish such builds were available is to facilitate debugging
> > and testing. When tested against the TPC-DS benchmark, the current master
> > branch has several correctness problems that were introduced after the
> > release of Hive 3.1.2. We have reported all problems known to us in [1]
> > and also submitted several patches. If such nightly builds had been
> > available, we would have saved quite a bit of time for implementing the
> > patches by quickly finding offending commits that introduced new
> > correctness bugs.
> >
> > In addition, you can find quite a few commits in the master branch that
> > report bugs which are not reproduced in Hive 3.1.2. Examples: HIVE-19990,
> > HIVE-14557, HIVE-21132, HIVE-21188, HIVE-21544, HIVE-22114,
> > HIVE-22227, HIVE-22236, HIVE-23911, HIVE-24198, HIVE-22777,
> > HIVE-25170, HIVE-25864, HIVE-26671.
> > (There may be some errors in this list because we compared against Hive
> > 3.1.2 with many patches backported.) Such nightly builds can be useful
> > for finding root causes of such bugs.
> >
> > Ideally I wish there was an automated procedure to create nightly builds,
> > run TPC-DS benchmark, and report correctness/performance results,
> > although this would be quite hard to implement. (I remember Spark
> > implemented this procedure in the era of Spark 2, but my memory could be
> > wrong.)
> >
> > [1] https://issues.apache.org/jira/browse/HIVE-26654
> >
> >
> > On Tue, May 23, 2023 at 10:44 AM Ayush Saxena <ayush...@gmail.com> wrote:
> >
> > > Hi Vihang,
> > > +1, We were even exploring publishing the docker images of the snapshot
> > > version as well per commit or maybe weekly, so just shoot 2 docker
> > > commands and you get a Hive cluster running with master code.
> > >
> > > Sai, I think to spin up an env via Docker with all these things should
> > > be doable for sure, but would require someone with real good expertise
> > > with docker as well as setting up these services with Hive. Obviously,
> > > I am not that guy :-)
> > >
> > > @Simhadri has a PR which publishes docker images once a release tag is
> > > pushed, you can explore to have similar stuff for the Snapshot version,
> > > maybe if that sounds cool
> > >
> > > -Ayush
> > >
> > > On Tue, 23 May 2023 at 04:26, Sai Hemanth Gantasala
> > > <saihema...@cloudera.com.invalid> wrote:
> > >
> > > > Hi Vihang,
> > > >
> > > > +1 on the idea.
> > > >
> > > > This is a great idea to quickly test if a certain feature is working
> > > > as expected on a certain branch.
> > > > This way we test data loss, correctness, or any other unexpected
> > > > scenarios that are Hive specific only. However, I'm wondering if it
> > > > is possible to deploy/test in a kerberized environment or issues
> > > > involving authorization services like sentry/ranger.
> > > >
> > > > Thanks,
> > > > Sai.
> > > >
> > > > On Mon, May 22, 2023 at 11:15 AM vihang karajgaonkar
> > > > <vihan...@apache.org> wrote:
> > > >
> > > > > Hello Team,
> > > > >
> > > > > I have observed that it is a common use-case where users would like
> > > > > to test out unreleased features/bug fixes either to unblock them or
> > > > > test out if the bug fixes really work as intended in their
> > > > > environments. Today in the case of Apache Hive, this is not very
> > > > > user friendly because it requires the end user to build the
> > > > > binaries directly from the Hive source code.
> > > > >
> > > > > I found that Apache Spark has a very useful infrastructure [1]
> > > > > which deploys nightly snapshots [2] [3] from the branch using
> > > > > GitHub Actions. This is super useful for any user who wants to try
> > > > > out the latest and greatest using the nightly builds.
> > > > >
> > > > > I was wondering if we should also adopt this. We can use GitHub
> > > > > Actions to upload the snapshot jars to the public repository (e.g.,
> > > > > GitHub Packages) and schedule it as a nightly job.
> > > > >
> > > > > [1] https://issues.apache.org/jira/browse/INFRA-21167
> > > > > [2] https://github.com/apache/spark/pkgs/container/apache-spark-ci-image
> > > > > [3] https://github.com/apache/spark/pull/30623
> > > > >
> > > > > I can take a stab at this if the community thinks that this is a
> > > > > nice thing to have.
> > > > >
> > > > > Thanks,
> > > > > Vihang
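
Sungwoo's point about using periodic builds to find the commit that
introduced a correctness bug can be illustrated with a binary search over
dated builds. The following is only a minimal sketch under stated
assumptions: the build labels and the is_bad check are hypothetical
stand-ins (nothing in the thread specifies how builds are named or tested),
and it assumes the oldest build is good, the newest is bad, and the bug
persists once introduced.

```python
from typing import Callable, Sequence


def first_bad_build(builds: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest build for which is_bad() is true.

    Assumes builds are ordered oldest to newest, the first one is good,
    the last one is bad, and a bug stays present once introduced.
    """
    # Invariant: builds[lo] is known good, builds[hi] is known bad.
    lo, hi = 0, len(builds) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid
        else:
            lo = mid
    return builds[hi]


# Hypothetical weekly snapshot labels; real builds would be jars or images.
weekly = [f"2022-W{w:02d}" for w in range(1, 53)]

# Hypothetical stand-in for "deploy the build, run the failing query,
# compare the result against the expected answer".
culprit = first_bad_build(weekly, lambda b: b >= "2022-W37")
print(culprit)  # prints 2022-W37
```

With 52 weekly builds this needs roughly six checks instead of 52, and the
remaining commits inside the identified week could then be narrowed down
with a source-level tool such as git bisect.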