Hey Simhadri,

thanks for starting this discussion.

Maven has many limitations when it comes to publishing multiple artifacts from the same module, and in most cases the end result is broken and hard to use. The POM file published for a module cannot correctly describe all of the module's artifacts, which is why every module has one main artifact: the dependency declarations are usually correct for the main artifact but are not representative of the rest.
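As a rough illustration of this limitation (a sketch, not taken from the Hive build; the version property is a placeholder), a consumer typically requests a secondary artifact through a classifier. Maven still reads the single POM published for the module to work out transitive dependencies, so whatever that POM declares is applied to the classified jar as well, whether or not it matches the jar's actual contents:

```xml
<!-- Hypothetical consumer pom.xml fragment; ${hive.version} is a placeholder. -->
<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-exec</artifactId>
  <version>${hive.version}</version>
  <!-- Selects hive-exec-${hive.version}-core.jar instead of the main jar. -->
  <classifier>core</classifier>
  <!-- Transitive dependencies still come from the one POM published for
       hive-exec; there is no separate POM for the "core" classifier. -->
</dependency>
```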
For example, end-users who consume the hive-exec-core jar tend to assume that Maven will automatically resolve all transitive dependencies and that things will work as usual, which is not the case. In the past, this assumption created a lot of confusion among consumers of the hive-exec-core.jar, with tickets and open debates that spanned multiple months. The discussions even reached a point where people requested that certain Hive features be reverted in order to rectify problems around transitive dependencies and the core jar.

I think we should stick to the usual Maven convention and publish just one artifact per module. Adding back and claiming to support the "core" jar is a step backwards that just postpones the real problems that we need to tackle.

Furthermore, I don't think that the hive-exec module was ever meant to be used as a dependency. It is mainly an application module, not a library module, and that's why shading takes place. Clearly some parts of hive-exec could become a library, and splitting hive-exec into other modules would be a promising direction going forward, but that is a bit outside the scope of the current discussion.

From the issues outlined above, the only actionable item that I see concerns the joda library, so we could try to simply relocate it if it is causing issues.

Finally, if someone wants to create a jar with specific contents from the hive-exec module, it is rather easy to do so. I created a small POC project [1] showing how someone can create something similar to the hive-exec-core.jar and incorporate it in their build. Each project has separate needs, so for such customization I feel that the burden shouldn't fall on the Hive community.

Best,
Stamatis

[1] https://github.com/zabetak/hive-core-poc

On Thu, Apr 25, 2024 at 11:12 AM Simhadri G <simhad...@apache.org> wrote:
>
> Hi Everyone,
>
> The hive-exec:core jar is used by spark, oozie, hudi and many other projects.
> Removal of the hive-exec:core jar has caused the following issues.
>
> Spark : https://lists.apache.org/list?dev@hive.apache.org:lte=1M:joda
> Oozie: https://lists.apache.org/thread/yld75ltf9y8d9q3cow3xqlg0fqyj6mkg
> Hudi: apache/hudi#8147
> Apache IotDB: https://lists.apache.org/thread/wdqsyj89w9cvyk1pyxr83hlxpg6zp1go
> Guava: https://github.com/google/guava/issues/6666
> joda-time: https://lists.apache.org/thread/sphgcvod3qx9wtc51ltpfyr8dpx9p294
>
> I understand that there is prior discussion about why the hive-exec:core jar
> was removed here:
> https://lists.apache.org/thread/cwtxnffoqpwgmdtlc9hyor2cm22djpkg
>
> We agreed that ultimately hive-exec jar should be used over hive-exec:core
> but there are quite a few dependencies that need to be shaded and relocated
> for this. https://issues.apache.org/jira/browse/HIVE-26220 .
>
> Until we shade & relocate dependencies in hive-exec, we should restore the
> hive-exec:core jar . The intention for this is to provide a smoother
> transition from the hive-exec:core to hive-exec jar for projects that depend
> on hive .
>
> Seeking inputs from the community and a way to move forward on this topic.
>
> I apologize in advance if I have missed anything.
>
> Thanks!
>
> Simhadri G