IIRC, Guowei wants to work on supporting Table API connectors in Plugins.
With that, we could have the Hive dependency as a plugin, avoiding
dependency conflicts.
On Thu, Feb 6, 2020 at 1:11 PM Jingsong Li wrote:
Hi Stephan,
Good idea. Just like Hadoop, we can have a flink-shaded-hive-uber.
Then setting up the Hive integration will be very simple: with one or two
pre-bundled versions, users just add these dependencies:
- flink-connector-hive.jar
- flink-shaded-hive-uber-.jar
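As a sketch of what verifying that setup could look like (the filename patterns below are assumptions modeled on the names in this thread, not released artifacts), a small script could check that both jars have landed in FLINK_HOME/lib:

```python
import re
from pathlib import Path

# Hypothetical filename patterns for the two dependencies listed above;
# the exact names would depend on the actually released artifacts.
REQUIRED_PATTERNS = [
    r"flink-connector-hive.*\.jar",
    r"flink-shaded-hive-uber.*\.jar",
]

def missing_dependencies(lib_dir):
    """Return the patterns for which no matching jar exists in lib_dir."""
    names = [p.name for p in Path(lib_dir).glob("*.jar")]
    return [pat for pat in REQUIRED_PATTERNS
            if not any(re.fullmatch(pat, n) for n in names)]
```

Such a check could run at SQL client startup and print which of the two downloads is still missing, instead of failing later with a ClassNotFoundException.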
Some changes are needed, but I think it sho…
Hi Jingsong!
It sounds like with two pre-bundled versions (Hive 1.2.1 and Hive 2.3.6)
you can cover a lot of versions.
Would it make sense to add these to flink-shaded (with proper exclusions
of unnecessary dependencies) and offer them as a download, similar to how
we offer pre-shaded Ha…
Hi Stephan,
The hive/lib/ directory has many jars; that lib covers execution, the
metastore, the Hive client, and everything else.
What we really depend on is hive-exec.jar (hive-metastore.jar is also
required for older Hive versions).
And hive-exec.jar is an uber jar. We only want half of its classes. These
half classes a…
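To make the "uber jar" point concrete: hive-exec bundles third-party classes alongside Hive's own, which is where the conflicts come from. A rough way to see that split (the helper below is an illustrative sketch, not an existing tool) is to group a jar's class entries by package prefix:

```python
import zipfile
from collections import Counter

def package_histogram(jar_path):
    """Count .class entries per two-level package prefix in a jar.

    A large share of entries outside org/apache suggests the jar bundles
    (and may conflict with) third-party dependencies.
    """
    counts = Counter()
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            if name.endswith(".class"):
                prefix = "/".join(name.split("/")[:2])
                counts[prefix] += 1
    return counts
```

Running this against a real hive-exec.jar would show, for example, how many classes live under com/google or org/codehaus versus Hive's own packages.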
Some thoughts about other options we have:
- Put fat/shaded jars for the common versions into "flink-shaded" and
offer them for download on the website, similar to pre-bundled Hadoop
versions.
- Look at the Presto code (Metastore protocol) and see if we can reuse
that
- Have a setup helper
Hi Stephan,
As Jingsong stated, in our documentation the recommended way to add Hive
deps is to use exactly what users have installed. It's just that we ask users
to manually add those jars, instead of automatically finding them based on
environment variables. I prefer to keep it this way for a while, and see if…
Hi all,
For your information, we have documented the detailed dependency
information [1]. I think it's a lot clearer than before, but it's still worse
than Presto and Spark (they avoid a Hive dependency or have it built in).
I thought about Stephan's suggestion:
- The hive/lib has 200+ jars, but we only nee…
We have had much trouble in the past from "too deep, too custom"
integrations that everyone got out of the box, i.e., Hadoop.
Flink has such a broad spectrum of use cases; if we have a custom build
for every other framework in that spectrum, we'll be in trouble.
So I would also be -1 for custom b…
Couldn't it simply be documented which jars are in the pre-built
convenience jars that can be downloaded from the website? Then people who
need a custom version would know which jars they need to provide to Flink.
Cheers,
Till
On Tue, Dec 17, 2019 at 6:49 PM Bowen Li wrote:
I'm not sure providing an uber jar would be possible.
Different from the Kafka and Elasticsearch connectors, which have
dependencies on a specific Kafka/Elasticsearch version, or the universal
Kafka connector that provides good compatibility, the Hive connector needs
to deal with Hive jars across all 1.x, 2.x, 3.x v…
Also -1 on separate builds.
After looking at how some other big-data engines handle distribution [1], I
didn't find a strong need to publish a separate build for just a specific
Hive version; there are, indeed, builds for different Hadoop versions.
Just like Seth and Aljoscha said, we could push a flink-hi…
Thanks all for explaining.
I misunderstood the original proposal.
-1 to put them in our distributions
+1 to providing Hive uber jars, as Seth and Aljoscha advised.
Hive is just a connector, no matter how important it is.
So I totally agree that we shouldn't put them in our distributions.
We can st…
I agree with Seth and Aljoscha and think that is the right way to go.
We already provide uber jars for Kafka and Elasticsearch for an
out-of-the-box experience; you can see the download links on this page [1].
Users can easily download the connectors and versions they like and drop
them into the SQL CLI lib directory. The u…
I was going to suggest the same thing as Seth. So yes, I'm against Flink
distributions that contain Hive, but in favor of convenience downloads as we
have for Hadoop.
Best,
Aljoscha
> On 13. Dec 2019, at 18:04, Seth Wiesman wrote:
I'm also -1 on separate builds.
What about publishing convenience jars that contain the dependencies for
each version? For example, there could be a flink-hive-1.2.1-uber.jar that
users could just add to their lib folder, containing all the necessary
dependencies to connect to that Hive version.
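One practical wrinkle with drop-in uber jars is users accidentally leaving two versions in lib/ at once. A quick sanity check could flag that; the naming scheme below is hypothetical, modeled on the flink-hive-1.2.1-uber.jar example:

```python
import re
from pathlib import Path

# Hypothetical naming scheme following the flink-hive-<version>-uber.jar
# example from this thread; real artifact names may differ.
UBER_RE = re.compile(r"flink-hive-([\d.]+)-uber\.jar")

def hive_uber_versions(lib_dir):
    """List the Hive versions of all flink-hive uber jars found in lib_dir.

    More than one entry means conflicting Hive classes would land on the
    classpath at the same time.
    """
    versions = []
    for p in Path(lib_dir).glob("*.jar"):
        m = UBER_RE.fullmatch(p.name)
        if m:
            versions.append(m.group(1))
    return sorted(versions)
```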
I'm generally not opposed to convenience binaries, if a huge number of
people would benefit from them, and the overhead for the Flink project is
low. I did not see a huge demand for such binaries yet (neither for the
Flink + Hive integration). Looking at Apache Spark, they are also only
offering co…
-1
We shouldn't need to deploy additional binaries to have a feature be
remotely usable.
This usually points to something else being done incorrectly.
If it is indeed such a hassle to set up Hive on Flink, then my conclusion
would be that either
a) the documentation needs to be improved, or
b) th…
Hi Bowen,
Thanks for driving this.
+1 for this proposal.
Due to our multi-version support, users are required to rely on
different dependencies, which does break the "out of the box" experience.
Now that the client has changed to child-first classloader resolution
by default, it puts forward higher…
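For readers unfamiliar with the child-first resolution mentioned here, the lookup order can be sketched abstractly. This is a toy model, not Flink's actual ClassLoader code: the child scope (user jars) is consulted before delegating to the parent (framework classpath), except for prefixes that are always resolved parent-first, such as "org.apache.flink." in Flink's default configuration:

```python
def resolve(class_name, child_classes, parent_classes, parent_first_prefixes=()):
    """Toy model of child-first class resolution.

    child_classes / parent_classes are the sets of class names each scope
    defines; parent_first_prefixes mimics the package prefixes that are
    still delegated to the parent to keep core classes consistent.
    """
    if any(class_name.startswith(p) for p in parent_first_prefixes) \
            and class_name in parent_classes:
        return "parent"
    if class_name in child_classes:
        return "child"   # user jar shadows the framework's copy
    if class_name in parent_classes:
        return "parent"  # fall back to the framework classpath
    raise LookupError(class_name)
```

This is why conflicting Hive jars in a user bundle now actively shadow whatever the framework ships, making careful dependency packaging more important rather than less.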
Hi Bowen~
Thanks for driving this. I tried using the SQL client with the Hive connector
about two weeks ago; from my experience it's painful to set up the environment.
+1 for this proposal.
Best,
Terry Wang
> On 13 Dec 2019, at 16:44, Bowen Li wrote:
+1, this is definitely necessary for a better user experience. Setting up
the environment is always painful for many big data tools.
On Fri, Dec 13, 2019 at 5:02 PM, Bowen Li wrote:
cc user ML in case anyone wants to chime in
On Fri, Dec 13, 2019 at 00:44 Bowen Li wrote:
Hi all,
I want to propose to have a couple of separate Flink distributions with Hive
dependencies on specific Hive versions (2.3.4 and 1.2.1). The distributions
will be provided to users on the Flink download page [1].
A few reasons to do this:
1) Flink-Hive integration is important to many, many Flink…