Hi Jingsong, I think I misunderstood you. So your argument is that, to support hive 1.0.0 - 1.2.2, we are actually using Hive 1.2.2 and thus we name the flink module as "flink-connector-hive-1.2", right? It makes sense to me now.
+1 for this change.

Cheers,
Bowen

On Thu, Mar 5, 2020 at 6:53 PM Jingsong Li <jingsongl...@gmail.com> wrote:

> Hi Bowen,
>
> My idea is to directly provide the really dependent version, such as hive 1.2.2, our jar name is hive 1.2.2, so that users can directly and clearly know the version. As for which metastore is supported, we can guide it in the document, otherwise, write 1.0, and the result version is indeed 1.2.2, which will make users have wrong expectations.
>
> Another, maybe 2.3.6 can support 2.0-2.2 after some efforts.
>
> Best,
> Jingsong Lee
>
> On Fri, Mar 6, 2020 at 1:00 AM Bowen Li <bowenl...@gmail.com> wrote:
>
> > > I have some hesitation, because the actual version number can better reflect the actual dependency. For example, if the user also knows the field hiveVersion[1]. He may enter the wrong hiveVersion because of the name, or he may have the wrong expectation for the hive built-in functions.
> >
> > Sorry, I'm not sure if my proposal is understood correctly.
> >
> > What I'm saying is, in your original proposal, taking an example, suggested naming the module as "flink-connector-hive-1.2" to support hive 1.0.0 - 1.2.2, a name including the highest Hive version it supports. I'm suggesting to name it "flink-connector-hive-1.0", a name including the lowest Hive version it supports.
> >
> > What do you think?
> >
> > On Wed, Mar 4, 2020 at 11:14 PM Jingsong Li <jingsongl...@gmail.com> wrote:
> >
> > > Hi Bowen, thanks for your reply.
> > >
> > > > will there be a base module like "flink-connector-hive-base" which holds all the common logic of these proposed modules
> > >
> > > Maybe we don't need, their implementation is only "pom.xml". Different versions have different dependencies.
> > >
> > > > it's more common to set the version in module name to be the lowest version that this module supports
> > >
> > > I have some hesitation, because the actual version number can better reflect the actual dependency. For example, if the user also knows the field hiveVersion[1]. He may enter the wrong hiveVersion because of the name, or he may have the wrong expectation for the hive built-in functions.
> > >
> > > [1] https://github.com/apache/flink/pull/11304
> > >
> > > Best,
> > > Jingsong Lee
> > >
> > > On Thu, Mar 5, 2020 at 2:34 PM Bowen Li <bowenl...@gmail.com> wrote:
> > >
> > > > Thanks Jingsong for your explanation! I'm +1 for this initiative.
> > > >
> > > > According to your description, I think it makes sense to incorporate support of Hive 2.2 to that of 2.0/2.1 and reducing the number of ranges to 4.
> > > >
> > > > A couple minor followup questions:
> > > > 1) will there be a base module like "flink-connector-hive-base" which holds all the common logic of these proposed modules and is compiled into the uber jar of "flink-connector-hive-xxx"?
> > > > 2) according to my observation, it's more common to set the version in module name to be the lowest version that this module supports, e.g. for Hive 1.0.0 - 1.2.2, the module name can be "flink-connector-hive-1.0" rather than "flink-connector-hive-1.2"
> > > >
> > > > On Wed, Mar 4, 2020 at 10:20 PM Jingsong Li <jingsongl...@gmail.com> wrote:
> > > >
> > > > > Thanks Bowen for involving.
> > > > >
> > > > > > why you proposed segregating hive versions into the 5 ranges above? & what different Hive features are supported in the 5 ranges?
> > > > >
> > > > > For only higher client dependencies version support lower hive metastore versions:
> > > > > - Hive 1.0.0 - 1.2.2, thrift change is OK, only hive date column stats, we can throw exception for the unsupported feature.
> > > > > - Hive 2.0 and Hive 2.1, primary key support and alter_partition api change.
> > > > > - Hive 2.2 no thrift change.
> > > > > - Hive 2.3 change many things, lots of thrift change.
> > > > > - Hive 3+, not null. unique, timestamp, so many things.
> > > > >
> > > > > All these things can be found in hive_metastore.thrift.
> > > > >
> > > > > I think I can try do more effort in implementation to use Hive 2.2 to support Hive 2.0. So the range size will be 4.
> > > > >
> > > > > > have you tested that whether the proposed corresponding Flink module will be fully compatible with each Hive version range?
> > > > >
> > > > > Yes, I have done some tests, not really for "fully", but it is a technical judgment.
> > > > >
> > > > > Best,
> > > > > Jingsong Lee
> > > > >
> > > > > On Thu, Mar 5, 2020 at 1:17 PM Bowen Li <bowenl...@gmail.com> wrote:
> > > > >
> > > > > > Thanks, Jingsong, for bringing this up. We've received lots of feedbacks in the past few months that the complexity involved in different Hive versions has been quite painful for users to start with. So it's great to step forward and deal with such issue.
> > > > > >
> > > > > > Before getting on a decision, can you please explain:
> > > > > >
> > > > > > 1) why you proposed segregating hive versions into the 5 ranges above?
> > > > > > 2) what different Hive features are supported in the 5 ranges?
> > > > > > 3) have you tested that whether the proposed corresponding Flink module will be fully compatible with each Hive version range?
> > > > > >
> > > > > > Thanks,
> > > > > > Bowen
> > > > > >
> > > > > > On Wed, Mar 4, 2020 at 1:00 AM Jingsong Lee <lzljs3620...@apache.org> wrote:
> > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > I'd like to propose introduce flink-connector-hive-xx modules.
> > > > > > >
> > > > > > > We have documented the dependencies detailed information[2]. But still has some inconvenient:
> > > > > > > - Too many versions, users need to pick one version from 8 versions.
> > > > > > > - Too many versions, It's not friendly to our developers either, because there's a problem/exception, we need to look at eight different versions of hive client code, which are often various.
> > > > > > > - Too many jars, for example, users need to download 4+ jars for Hive 1.x from various places.
> > > > > > >
> > > > > > > We have discussed in [1] and [2], but unfortunately, we can not achieve an agreement.
> > > > > > >
> > > > > > > For improving this, I'd like to introduce few flink-connector-hive-xx modules in flink-connectors, module contains all the dependencies related to hive. And only support lower hive metastore versions:
> > > > > > > - "flink-connector-hive-1.2" to support hive 1.0.0 - 1.2.2
> > > > > > > - "flink-connector-hive-2.0" to support hive 2.0.0 - 2.0.1
> > > > > > > - "flink-connector-hive-2.2" to support hive 2.1.0 - 2.2.0
> > > > > > > - "flink-connector-hive-2.3" to support hive 2.3.0 - 2.3.6
> > > > > > > - "flink-connector-hive-3.1" to support hive 3.0.0 - 3.1.2
> > > > > > >
> > > > > > > Users can choose one and download to flink/lib. It includes all hive things.
> > > > > > >
> > > > > > > I try to use a single module to deploy multiple versions, but I can not find a suitable way, because different modules require different versions and different dependencies.
> > > > > > >
> > > > > > > What do you think?
> > > > > > >
> > > > > > > [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-have-separate-Flink-distributions-with-built-in-Hive-dependencies-td35918.html
> > > > > > > [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-109-Improve-Hive-dependencies-out-of-box-experience-td38290.html
> > > > > > >
> > > > > > > Best,
> > > > > > > Jingsong Lee
> > > > >
> > > > > --
> > > > > Best, Jingsong Lee
> > >
> > > --
> > > Best, Jingsong Lee
>
> --
> Best, Jingsong Lee
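
For reference, the module-to-version mapping from the original proposal could be sketched roughly as below. This is only an illustration and not code from Flink: `HiveConnectorModules`, `moduleFor`, and `compareVersions` are hypothetical names, and the ranges are copied from the five modules listed in the first mail, before the discussion about merging the 2.0 - 2.2 ranges. Each module is named after the highest Hive minor version in its range, which is the convention agreed on above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Flink): picks the proposed connector
// module for a given Hive version string of the form "major.minor.patch".
final class HiveConnectorModules {

    // Module name -> {lowest, highest} supported Hive version (both inclusive),
    // as listed in the original proposal.
    private static final Map<String, String[]> RANGES = new LinkedHashMap<>();

    static {
        RANGES.put("flink-connector-hive-1.2", new String[] {"1.0.0", "1.2.2"});
        RANGES.put("flink-connector-hive-2.0", new String[] {"2.0.0", "2.0.1"});
        RANGES.put("flink-connector-hive-2.2", new String[] {"2.1.0", "2.2.0"});
        RANGES.put("flink-connector-hive-2.3", new String[] {"2.3.0", "2.3.6"});
        RANGES.put("flink-connector-hive-3.1", new String[] {"3.0.0", "3.1.2"});
    }

    // Compares two three-part versions numerically, component by component.
    static int compareVersions(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < 3; i++) {
            int cmp = Integer.compare(Integer.parseInt(x[i]), Integer.parseInt(y[i]));
            if (cmp != 0) {
                return cmp;
            }
        }
        return 0;
    }

    // Returns the module whose range contains hiveVersion, or throws if none does.
    static String moduleFor(String hiveVersion) {
        for (Map.Entry<String, String[]> entry : RANGES.entrySet()) {
            String[] range = entry.getValue();
            if (compareVersions(hiveVersion, range[0]) >= 0
                    && compareVersions(hiveVersion, range[1]) <= 0) {
                return entry.getKey();
            }
        }
        throw new IllegalArgumentException("No proposed module covers Hive " + hiveVersion);
    }
}
```

With this sketch, `HiveConnectorModules.moduleFor("1.1.0")` resolves to "flink-connector-hive-1.2", which matches the point made above: the name reflects the Hive client version that is actually bundled, while the supported metastore range is documented separately.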