Hi Jingsong,

I think I misunderstood you. So your argument is that, to support Hive
1.0.0 - 1.2.2, we are actually using Hive 1.2.2, and thus we name the Flink
module "flink-connector-hive-1.2", right? It makes sense to me now.
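
To make the naming concrete, here is a minimal sketch of how a user's Hive
version would map to a module under the original proposal quoted below. The
ranges are copied from that proposal; the names MODULE_RANGES, parse_version
and module_for are hypothetical helpers for illustration only, not part of
Flink.

```python
# Hive version ranges per module, copied from the original proposal in this
# thread. Each module is named after the Hive version it actually depends on,
# which is the highest version in its range.
MODULE_RANGES = {
    "flink-connector-hive-1.2": ((1, 0, 0), (1, 2, 2)),
    "flink-connector-hive-2.0": ((2, 0, 0), (2, 0, 1)),
    "flink-connector-hive-2.2": ((2, 1, 0), (2, 2, 0)),
    "flink-connector-hive-2.3": ((2, 3, 0), (2, 3, 6)),
    "flink-connector-hive-3.1": ((3, 0, 0), (3, 1, 2)),
}


def parse_version(text):
    """Turn '1.2' or '1.2.2' into a 3-tuple of ints, padding with zeros."""
    parts = [int(p) for p in text.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts[:3])


def module_for(hive_version):
    """Return the proposed module name for a Hive version, or None."""
    version = parse_version(hive_version)
    for module, (low, high) in MODULE_RANGES.items():
        if low <= version <= high:
            return module
    return None
```

For example, module_for("1.1.0") falls in the 1.0.0 - 1.2.2 range, so it
resolves to the module named after 1.2, and a version outside every range
resolves to None.
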

+1 for this change.

Cheers,
Bowen

On Thu, Mar 5, 2020 at 6:53 PM Jingsong Li <jingsongl...@gmail.com> wrote:

> Hi Bowen,
>
> My idea is to put the version we really depend on, such as Hive 1.2.2,
> in the jar name, so that users can directly and clearly know the version.
> As for which metastore versions are supported, we can explain that in the
> documentation. Otherwise, if we write 1.0 while the actual version is
> 1.2.2, users will have wrong expectations.
>
> Also, 2.3.6 may be able to support 2.0-2.2 after some effort.
>
> Best,
> Jingsong Lee
>
> On Fri, Mar 6, 2020 at 1:00 AM Bowen Li <bowenl...@gmail.com> wrote:
>
> > > I have some hesitation, because the actual version number can better
> > > reflect the actual dependency. For example, if the user also knows the
> > > field hiveVersion[1], they may enter the wrong hiveVersion because of
> > > the name, or have the wrong expectation for the Hive built-in functions.
> >
> > Sorry, I'm not sure if my proposal is understood correctly.
> >
> > What I'm saying is that your original proposal suggested, for example,
> > naming the module "flink-connector-hive-1.2" to support Hive 1.0.0 -
> > 1.2.2, a name that includes the highest Hive version it supports. I'm
> > suggesting we name it "flink-connector-hive-1.0", a name that includes
> > the lowest Hive version it supports.
> >
> > What do you think?
> >
> >
> >
> > On Wed, Mar 4, 2020 at 11:14 PM Jingsong Li <jingsongl...@gmail.com>
> > wrote:
> >
> > > Hi Bowen, thanks for your reply.
> > >
> > > > will there be a base module like "flink-connector-hive-base" which
> > > > holds all the common logic of these proposed modules
> > >
> > > Maybe we don't need one; each module's implementation is only a
> > > "pom.xml". Different versions have different dependencies.
> > >
> > > > it's more common to set the version in module name to be the lowest
> > > > version that this module supports
> > >
> > > I have some hesitation, because the actual version number can better
> > > reflect the actual dependency. For example, if the user also knows the
> > > field hiveVersion[1], they may enter the wrong hiveVersion because of
> > > the name, or have the wrong expectation for the Hive built-in functions.
> > >
> > > [1] https://github.com/apache/flink/pull/11304
> > >
> > > Best,
> > > Jingsong Lee
> > >
> > > On Thu, Mar 5, 2020 at 2:34 PM Bowen Li <bowenl...@gmail.com> wrote:
> > >
> > > > Thanks Jingsong for your explanation! I'm +1 for this initiative.
> > > >
> > > > According to your description, I think it makes sense to fold
> > > > support of Hive 2.2 into that of 2.0/2.1, reducing the number of
> > > > ranges to 4.
> > > >
> > > > A couple of minor follow-up questions:
> > > > 1) will there be a base module like "flink-connector-hive-base" which
> > > > holds all the common logic of these proposed modules and is compiled
> > > > into the uber jar of "flink-connector-hive-xxx"?
> > > > 2) according to my observation, it's more common to set the version
> > > > in the module name to the lowest version that this module supports,
> > > > e.g. for Hive 1.0.0 - 1.2.2, the module name can be
> > > > "flink-connector-hive-1.0" rather than "flink-connector-hive-1.2"
> > > >
> > > >
> > > > On Wed, Mar 4, 2020 at 10:20 PM Jingsong Li <jingsongl...@gmail.com>
> > > > wrote:
> > > >
> > > > > Thanks Bowen for getting involved.
> > > > >
> > > > > > why you proposed segregating hive versions into the 5 ranges
> > > > > > above? & what different Hive features are supported in the 5
> > > > > > ranges?
> > > > >
> > > > > The ranges rely on a higher client dependency version supporting
> > > > > lower Hive metastore versions:
> > > > > - Hive 1.0.0 - 1.2.2: the thrift changes are OK; the only
> > > > > difference is Hive date column stats, and we can throw an exception
> > > > > for the unsupported feature.
> > > > > - Hive 2.0 and Hive 2.1: primary key support and alter_partition
> > > > > API change.
> > > > > - Hive 2.2: no thrift change.
> > > > > - Hive 2.3: many changes, lots of thrift changes.
> > > > > - Hive 3+: not null, unique, timestamp, and many other things.
> > > > >
> > > > > All these things can be found in hive_metastore.thrift.
> > > > >
> > > > > I think I can put more effort into the implementation so that
> > > > > Hive 2.2 also supports Hive 2.0. Then the number of ranges will be 4.
> > > > >
> > > > > > have you tested that whether the proposed corresponding Flink
> > > > > > module will be fully compatible with each Hive version range?
> > > > >
> > > > > Yes, I have done some tests. They do not really cover "fully"
> > > > > compatible; that part is a technical judgment.
> > > > >
> > > > > Best,
> > > > > Jingsong Lee
> > > > >
> > > > > On Thu, Mar 5, 2020 at 1:17 PM Bowen Li <bowenl...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Thanks, Jingsong, for bringing this up. We've received lots of
> > > > > > feedback in the past few months that the complexity involved in
> > > > > > different Hive versions has made it quite painful for users to get
> > > > > > started. So it's great to step forward and deal with this issue.
> > > > > >
> > > > > > Before reaching a decision, can you please explain:
> > > > > >
> > > > > > 1) why did you propose segregating Hive versions into the 5 ranges
> > > > > > above?
> > > > > > 2) what different Hive features are supported in the 5 ranges?
> > > > > > 3) have you tested whether the proposed corresponding Flink module
> > > > > > will be fully compatible with each Hive version range?
> > > > > >
> > > > > > Thanks,
> > > > > > Bowen
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Wed, Mar 4, 2020 at 1:00 AM Jingsong Lee <lzljs3620...@apache.org>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > I'd like to propose introducing flink-connector-hive-xx modules.
> > > > > > >
> > > > > > > We have documented the dependency details[2], but some
> > > > > > > inconveniences remain:
> > > > > > > - Too many versions: users need to pick one version from 8
> > > > > > > versions.
> > > > > > > - Too many versions: it's not friendly to our developers either,
> > > > > > > because when there's a problem/exception, we need to look at eight
> > > > > > > different versions of the Hive client code, which often differ.
> > > > > > > - Too many jars: for example, users need to download 4+ jars for
> > > > > > > Hive 1.x from various places.
> > > > > > >
> > > > > > > We have discussed this in [1] and [2], but unfortunately we could
> > > > > > > not reach an agreement.
> > > > > > >
> > > > > > > To improve this, I'd like to introduce a few
> > > > > > > flink-connector-hive-xx modules in flink-connectors. Each module
> > > > > > > contains all the dependencies related to Hive, and supports only
> > > > > > > Hive metastore versions at or below its own:
> > > > > > > - "flink-connector-hive-1.2" to support hive 1.0.0 - 1.2.2
> > > > > > > - "flink-connector-hive-2.0" to support hive 2.0.0 - 2.0.1
> > > > > > > - "flink-connector-hive-2.2" to support hive 2.1.0 - 2.2.0
> > > > > > > - "flink-connector-hive-2.3" to support hive 2.3.0 - 2.3.6
> > > > > > > - "flink-connector-hive-3.1" to support hive 3.0.0 - 3.1.2
> > > > > > >
> > > > > > > Users can choose one and download it to flink/lib. It includes
> > > > > > > all the Hive dependencies.
> > > > > > >
> > > > > > > I tried to use a single module to deploy multiple versions, but I
> > > > > > > could not find a suitable way, because different modules require
> > > > > > > different versions and different dependencies.
> > > > > > >
> > > > > > > What do you think?
> > > > > > >
> > > > > > > [1]
> > > > > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-have-separate-Flink-distributions-with-built-in-Hive-dependencies-td35918.html
> > > > > > > [2]
> > > > > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-109-Improve-Hive-dependencies-out-of-box-experience-td38290.html
> > > > > > >
> > > > > > > Best,
> > > > > > > Jingsong Lee
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Best, Jingsong Lee
> > > > >
> > > >
> > >
> > >
> > > --
> > > Best, Jingsong Lee
> > >
> >
>
>
> --
> Best, Jingsong Lee
>
