I'm also -1 on separate builds. What about publishing convenience jars that contain the dependencies for each version? For example, there could be a flink-hive-1.2.1-uber.jar containing all the dependencies needed to connect to that Hive version, which users could simply drop into their lib folder.
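As a sketch of the workflow such a convenience jar would enable (the jar name follows the proposal above and is hypothetical -- no such artifact is published; a scratch directory stands in for a real Flink distribution):

```shell
# Sketch of the proposed "drop one jar into lib/" workflow. A temp
# directory stands in for a real Flink install; the jar name is the
# hypothetical one proposed above.
WORK=$(mktemp -d)
cd "$WORK"
FLINK_HOME=$WORK/flink
mkdir -p "$FLINK_HOME/lib"

# Stand-in for the convenience jar downloaded from the release page:
touch flink-hive-1.2.1-uber.jar

# The single step users would perform: everything in lib/ ends up on
# the classpath of the Flink client and cluster processes.
cp flink-hive-1.2.1-uber.jar "$FLINK_HOME/lib/"

ls "$FLINK_HOME/lib"   # -> flink-hive-1.2.1-uber.jar
```

In a real setup this would be followed by restarting the cluster (bin/stop-cluster.sh, then bin/start-cluster.sh) so the new classpath takes effect.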
On Fri, Dec 13, 2019 at 8:50 AM Robert Metzger <rmetz...@apache.org> wrote:

> I'm generally not opposed to convenience binaries if a huge number of
> people would benefit from them and the overhead for the Flink project is
> low. I have not seen huge demand for such binaries yet (neither for the
> Flink + Hive integration). Looking at Apache Spark, they also offer
> convenience binaries only for Hadoop.
>
> Maybe we could provide a "Docker Playground" for Flink + Hive in the
> documentation (and the flink-playgrounds.git repo)?
> (similar to
> https://ci.apache.org/projects/flink/flink-docs-master/getting-started/docker-playgrounds/flink-operations-playground.html
> )
>
> On Fri, Dec 13, 2019 at 3:04 PM Chesnay Schepler <ches...@apache.org>
> wrote:
>
> > -1
> >
> > We shouldn't need to deploy additional binaries to have a feature be
> > remotely usable. This usually points to something else being done
> > incorrectly.
> >
> > If it is indeed such a hassle to set up Hive on Flink, then my
> > conclusion would be that either
> > a) the documentation needs to be improved, or
> > b) the architecture needs to be improved,
> > or, if all else fails, c) we provide a utility script to make setup
> > easier.
> >
> > We spent a lot of time reducing the number of binaries in the Hadoop
> > days, and also went to extra lengths to prevent a separate Java 11
> > binary, and I see no reason why Hive should get special treatment on
> > this matter.
> >
> > Regards,
> > Chesnay
> >
> > On 13/12/2019 09:44, Bowen Li wrote:
> > > Hi all,
> > >
> > > I want to propose to have a couple of separate Flink distributions
> > > with Hive dependencies on specific Hive versions (2.3.4 and 1.2.1).
> > > The distributions will be provided to users on the Flink download
> > > page [1].
> > >
> > > A few reasons to do this:
> > >
> > > 1) Flink-Hive integration is important to many Flink and Hive users
> > > in two dimensions:
> > >     a) for Flink metadata: HiveCatalog is the only persistent
> > > catalog to manage Flink tables. With Flink 1.10 supporting more DDL,
> > > the persistent catalog will play an even more critical role in
> > > users' workflows.
> > >     b) for Flink data: the Hive data connector (source/sink) helps
> > > both Flink and Hive users unlock new use cases in streaming,
> > > near-realtime/realtime data warehousing, backfill, etc.
> > >
> > > 2) Currently users have to go through a *really* tedious process to
> > > get started, because it requires lots of extra jars (see [2]) that
> > > are absent from Flink's lean distribution. We've had so many users
> > > from the public mailing list, private email, and DingTalk groups who
> > > got frustrated spending lots of time figuring out the jars
> > > themselves. They would rather have a more "right out of the box"
> > > quickstart experience, and play with the catalog and source/sink
> > > without hassle.
> > >
> > > 3) It's easier for users to swap in their own Hive versions - just
> > > replace those jars with the right versions, with no need to find the
> > > doc.
> > >
> > > * Hive 2.3.4 and 1.2.1 are two versions that represent a large user
> > > base out there, which is why we use them as examples for the
> > > dependencies in [1], even though we now support almost all Hive
> > > versions [3].
> > >
> > > I want to hear what the community thinks about this, and how to
> > > achieve it if we believe that's the way to go.
> > >
> > > Cheers,
> > > Bowen
> > >
> > > [1] https://flink.apache.org/downloads.html
> > > [2]
> > > https://ci.apache.org/projects/flink/flink-docs-master/dev/table/hive/#dependencies
> > > [3]
> > > https://ci.apache.org/projects/flink/flink-docs-master/dev/table/hive/#supported-hive-versions
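For contrast, the tedious setup described in 2) looks roughly like this today: users must locate several jars themselves and copy each one into lib/ (jar names and versions below are illustrative stand-ins; the authoritative per-version list is in [2]):

```shell
# Sketch of the current manual multi-jar setup. A temp directory stands
# in for a real Flink install; jar names/versions are illustrative.
WORK=$(mktemp -d)
cd "$WORK"
FLINK_HOME=$WORK/flink
mkdir -p "$FLINK_HOME/lib"

# Stand-ins for jars the user must hunt down from Maven Central and/or
# a Hive distribution (see the dependencies doc for the real list):
for jar in flink-connector-hive_2.11-1.10.0.jar \
           hive-exec-2.3.4.jar \
           flink-shaded-hadoop-2-uber-2.7.5-8.0.jar; do
  touch "$jar"                      # placeholder for the downloaded jar
  cp "$jar" "$FLINK_HOME/lib/"
done

ls "$FLINK_HOME/lib"
```

Collapsing this into a single convenience jar per Hive version is essentially what the proposal above, and the uber-jar counter-proposal, are both trying to achieve.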