Hello Hive users,

After attending the Hive meetup yesterday (huge thanks to the organizers!), it occurred to me that many organizations are probably maintaining their own Hive 2 and 3 branches by backporting important patches to vanilla Hive. Ideally, all the important patches would be merged regularly into the Hive 2 and 3 branches (e.g., branch-2.3 and branch-3.1), but I guess this would take a lot of time and effort on the Hive committer side, and at the moment most of the effort seems to be directed at the master branch.
I find this process of backporting patches to the Hive 2 and 3 branches challenging and time-consuming, especially for "outsiders" who have not implemented or reviewed the patches. The problem is twofold:

1) you have to decide which patches to apply and in what order;
2) you have to run all the tests to make sure that the new patches are compatible with the code base and do not introduce new bugs.

1) is not easy because a patch from the master branch sometimes fails to merge because of missing dependencies. In such a case, you have to go back through the commit history, identify the commits it depends on, and merge them first. Depending on the extent of the changes made in the patch, this can be a big pain.

2) can also be a problem if applying a new patch changes the test results. Sometimes a patch merges with no conflicts, but some tests fail. Besides, running the tests can itself take a long time.

So, I wonder if anyone could share their experience and wisdom on how to maintain Hive 2 and 3 branches, or share their git repos.

As for us, we have applied about 210 patches to Hive 3.1.3 (since Nov 2, 2020), and are in the middle of applying an additional 100+ patches. You can find our work at the following repo. (You can ignore the last commit, which is internal to our work.)

https://github.com/mr3project/hive-mr3/commits/master3

Thanks,

--- Sungwoo Park
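
P.S. To make problem 1) a bit more concrete, here is a minimal sketch of the ordering step. It is only an illustration, not a description of our actual tooling: the patch names (PATCH-A, etc.) and the `depends_on` map are hypothetical placeholders, and in practice the dependency information has to be recovered by hand from the commit history. Given such a map, a topological sort yields an order in which every patch comes after the patches it depends on. It uses Python's standard `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical example: each patch maps to the set of patches that must be
# applied before it. These names are placeholders, not real Hive patches.
depends_on = {
    "PATCH-C": {"PATCH-A", "PATCH-B"},
    "PATCH-B": {"PATCH-A"},
    "PATCH-A": set(),
    "PATCH-D": set(),  # independent patch with no prerequisites
}


def backport_order(depends_on):
    """Return the patches in an order where every dependency comes first.

    Raises graphlib.CycleError if the dependency map contains a cycle.
    """
    return list(TopologicalSorter(depends_on).static_order())


order = backport_order(depends_on)
print(order)
```

The resulting list could then be applied to the target branch one patch at a time, running the tests after each step or after each small batch. The hard part, of course, is building the dependency map in the first place.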