Hi Bowen,

My idea is to put the actual dependency version in the name. For example,
if the jar bundles Hive 1.2.2, the jar name says 1.2.2, so that users can
see directly and clearly which version they get. Which metastore versions
are supported can be covered in the documentation. Otherwise, if the name
says 1.0 while the bundled version is really 1.2.2, users will have wrong
expectations.

Also, 2.3.6 may be able to support 2.0-2.2 after some effort.
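
To make the naming point concrete, here is a rough Python sketch (not Flink
code). The ranges are the ones from my original proposal quoted below. The
table name SUPPORTED, the helper names, and the assumption that each jar
bundles the highest Hive version of its range are only for illustration.

```python
# Hypothetical sketch: the jar name carries the Hive version actually bundled,
# while the supported metastore range is documented separately.
# Assumption: each jar bundles the highest version of its proposed range.
SUPPORTED = [
    # (bundled Hive version, lowest metastore, highest metastore)
    ("1.2.2", "1.0.0", "1.2.2"),
    ("2.0.1", "2.0.0", "2.0.1"),
    ("2.2.0", "2.1.0", "2.2.0"),
    ("2.3.6", "2.3.0", "2.3.6"),
    ("3.1.2", "3.0.0", "3.1.2"),
]


def parse(version):
    """Turn "1.2.2" into (1, 2, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def bundled_version_for(metastore_version):
    """Return the bundled Hive version whose range covers the metastore."""
    wanted = parse(metastore_version)
    for bundled, low, high in SUPPORTED:
        if parse(low) <= wanted <= parse(high):
            return bundled
    raise ValueError("no connector covers Hive " + metastore_version)
```

With this table, a user on a Hive 1.0.0 metastore gets the jar that bundles
1.2.2, and the name tells them so: bundled_version_for("1.0.0") returns
"1.2.2".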

Best,
Jingsong Lee

On Fri, Mar 6, 2020 at 1:00 AM Bowen Li <bowenl...@gmail.com> wrote:

> > I have some hesitation, because the actual version number better
> > reflects the actual dependency. For example, if the user also knows the
> > field hiveVersion[1], he may enter the wrong hiveVersion because of the
> > name, or he may have the wrong expectation for the Hive built-in
> > functions.
>
> Sorry, I'm not sure if my proposal is understood correctly.
>
> What I'm saying is: your original proposal, for example, suggested naming
> the module "flink-connector-hive-1.2" to support Hive 1.0.0 - 1.2.2, i.e. a
> name including the highest Hive version it supports. I'm suggesting naming
> it "flink-connector-hive-1.0", i.e. a name including the lowest Hive
> version it supports.
>
> What do you think?
>
>
>
> On Wed, Mar 4, 2020 at 11:14 PM Jingsong Li <jingsongl...@gmail.com>
> wrote:
>
> > Hi Bowen, thanks for your reply.
> >
> > > will there be a base module like "flink-connector-hive-base" which
> > > holds all the common logic of these proposed modules
> >
> > Maybe we don't need one; each module's implementation is only a
> > "pom.xml". Different versions have different dependencies.
> >
> > > it's more common to set the version in module name to be the lowest
> > > version that this module supports
> >
> > I have some hesitation, because the actual version number better
> > reflects the actual dependency. For example, if the user also knows the
> > field hiveVersion[1], he may enter the wrong hiveVersion because of the
> > name, or he may have the wrong expectation for the Hive built-in
> > functions.
> >
> > [1] https://github.com/apache/flink/pull/11304
> >
> > Best,
> > Jingsong Lee
> >
> > On Thu, Mar 5, 2020 at 2:34 PM Bowen Li <bowenl...@gmail.com> wrote:
> >
> > > Thanks Jingsong for your explanation! I'm +1 for this initiative.
> > >
> > > According to your description, I think it makes sense to incorporate
> > > support of Hive 2.2 into that of 2.0/2.1, reducing the number of
> > > ranges to 4.
> > >
> > > A couple of minor follow-up questions:
> > > 1) will there be a base module like "flink-connector-hive-base" which
> > > holds all the common logic of these proposed modules and is compiled
> > > into the uber jar of "flink-connector-hive-xxx"?
> > > 2) according to my observation, it's more common to set the version in
> > > the module name to be the lowest version that this module supports,
> > > e.g. for Hive 1.0.0 - 1.2.2, the module name can be
> > > "flink-connector-hive-1.0" rather than "flink-connector-hive-1.2"
> > >
> > >
> > > On Wed, Mar 4, 2020 at 10:20 PM Jingsong Li <jingsongl...@gmail.com>
> > > wrote:
> > >
> > > > Thanks Bowen for getting involved.
> > > >
> > > > > why you proposed segregating hive versions into the 5 ranges
> > > > > above? & what different Hive features are supported in the 5
> > > > > ranges?
> > > >
> > > > Given that only a higher client dependency version can support lower
> > > > Hive metastore versions:
> > > > - Hive 1.0.0 - 1.2.2: the thrift change is OK; the only difference is
> > > > Hive date column stats, and we can throw an exception for the
> > > > unsupported feature.
> > > > - Hive 2.0 and Hive 2.1: primary key support and alter_partition API
> > > > change.
> > > > - Hive 2.2: no thrift change.
> > > > - Hive 2.3: many changes, lots of thrift changes.
> > > > - Hive 3+: not null, unique, timestamp, and many other things.
> > > >
> > > > All these things can be found in hive_metastore.thrift.
> > > >
> > > > I think I can put more effort into the implementation so that Hive
> > > > 2.2 supports Hive 2.0, so the number of ranges will be 4.
> > > >
> > > > > have you tested that whether the proposed corresponding Flink
> > > > > module will be fully compatible with each Hive version range?
> > > >
> > > > Yes, I have done some tests. Not really "fully", but it is a
> > > > technical judgment.
> > > >
> > > > Best,
> > > > Jingsong Lee
> > > >
> > > > On Thu, Mar 5, 2020 at 1:17 PM Bowen Li <bowenl...@gmail.com> wrote:
> > > >
> > > > > Thanks, Jingsong, for bringing this up. We've received lots of
> > > > > feedback in the past few months that the complexity involved in
> > > > > different Hive versions has been quite painful for users to start
> > > > > with. So it's great to step forward and deal with this issue.
> > > > >
> > > > > Before getting to a decision, can you please explain:
> > > > >
> > > > > 1) why you proposed segregating hive versions into the 5 ranges
> > > > > above?
> > > > > 2) what different Hive features are supported in the 5 ranges?
> > > > > 3) have you tested whether the proposed corresponding Flink module
> > > > > will be fully compatible with each Hive version range?
> > > > >
> > > > > Thanks,
> > > > > Bowen
> > > > >
> > > > >
> > > > >
> > > > > On Wed, Mar 4, 2020 at 1:00 AM Jingsong Lee <lzljs3620...@apache.org>
> > > > > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > I'd like to propose introducing flink-connector-hive-xx modules.
> > > > > >
> > > > > > We have documented the detailed dependency information[2], but
> > > > > > there are still some inconveniences:
> > > > > > - Too many versions: users need to pick one version from 8
> > > > > > versions.
> > > > > > - Too many versions for our developers as well: when there is a
> > > > > > problem/exception, we need to look at eight different versions
> > > > > > of the hive client code, which often differ.
> > > > > > - Too many jars: for example, users need to download 4+ jars for
> > > > > > Hive 1.x from various places.
> > > > > >
> > > > > > We have discussed this in [1] and [2], but unfortunately we
> > > > > > could not reach an agreement.
> > > > > >
> > > > > > To improve this, I'd like to introduce a few
> > > > > > flink-connector-hive-xx modules in flink-connectors. Each module
> > > > > > contains all the dependencies related to Hive and supports only
> > > > > > its own and lower Hive metastore versions:
> > > > > > - "flink-connector-hive-1.2" to support hive 1.0.0 - 1.2.2
> > > > > > - "flink-connector-hive-2.0" to support hive 2.0.0 - 2.0.1
> > > > > > - "flink-connector-hive-2.2" to support hive 2.1.0 - 2.2.0
> > > > > > - "flink-connector-hive-2.3" to support hive 2.3.0 - 2.3.6
> > > > > > - "flink-connector-hive-3.1" to support hive 3.0.0 - 3.1.2
> > > > > >
> > > > > > Users can choose one and download it to flink/lib. It includes
> > > > > > everything Hive-related.
> > > > > >
> > > > > > I tried to use a single module to deploy multiple versions, but
> > > > > > I could not find a suitable way, because different modules
> > > > > > require different versions and different dependencies.
> > > > > >
> > > > > > What do you think?
> > > > > >
> > > > > > [1]
> > > > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-have-separate-Flink-distributions-with-built-in-Hive-dependencies-td35918.html
> > > > > > [2]
> > > > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-109-Improve-Hive-dependencies-out-of-box-experience-td38290.html
> > > > > >
> > > > > > Best,
> > > > > > Jingsong Lee
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Best, Jingsong Lee
> > > >
> > >
> >
> >
> > --
> > Best, Jingsong Lee
> >
>


-- 
Best, Jingsong Lee
