Thanks for the help, guys. I can work with that.
Maybe it makes sense to add a note about this to the Parquet format documentation page:
https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/connectors/table/formats/parquet/

That page does not mention Hadoop at all, so the format seemed just as
straightforward to use as the other formats.

regards, Frank


On Wed, Feb 22, 2023 at 9:42 AM Martijn Visser <martijnvis...@apache.org>
wrote:

> Hi Frank,
>
> Parquet always requires Hadoop. There is a Parquet ticket to make it
> possible to read/write Parquet without depending on Hadoop, but that's
> still open. So in order for Flink to read or write Parquet, it needs the
> necessary Hadoop dependencies, as outlined in
> https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/configuration/advanced/#hadoop-dependencies.
> When I made a recipe for writing Parquet, I needed to add at least
> org.apache.hadoop:hadoop-common and
> org.apache.hadoop:hadoop-mapreduce-client-core.
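>
> As a rough sketch, those dependencies in a pom.xml look like this (the
> version is illustrative; pick one that matches your environment):
>
>   <dependency>
>     <groupId>org.apache.hadoop</groupId>
>     <artifactId>hadoop-common</artifactId>
>     <version>3.3.4</version> <!-- illustrative version -->
>   </dependency>
>   <dependency>
>     <groupId>org.apache.hadoop</groupId>
>     <artifactId>hadoop-mapreduce-client-core</artifactId>
>     <version>3.3.4</version> <!-- illustrative version -->
>   </dependency>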
>
> Best regards,
>
> Martijn
>
> On Thu, Feb 9, 2023 at 10:07 AM Frank Lyaruu <flya...@gmail.com> wrote:
>
>> Hi all, I’m using the Flink k8s operator to run a SQL stream to/from
>> various connectors, and I just added the Parquet format. I customized the
>> image a bit per the example (mostly by adding Maven downloads of
>> flink-connector* jars). If I do that for flink-parquet-1.16.1, it fails
>> with a missing class: org/apache/hadoop/conf/Configuration
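>>
>> For concreteness, what I mean is a Dockerfile addition roughly along
>> these lines (base image, URL, and version illustrative):
>>
>>   FROM flink:1.16.1
>>   # download the format jar into the Flink lib folder
>>   RUN wget -P /opt/flink/lib \
>>     https://repo.maven.apache.org/maven2/org/apache/flink/flink-parquet/1.16.1/flink-parquet-1.16.1.jar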
>>
>> I started adding hadoop-common (which contains that class), but that jar
>> is huge and pulls in a bunch of dependencies of its own, even for just
>> that one class, so it would be quite the rabbit hole. I see an old thread
>> that seems very similar:
>> https://www.mail-archive.com/user@flink.apache.org/msg43028.html but it
>> doesn't reach any conclusion.
>>
>> How _is_ this supposed to work? The Flink docs on the Parquet format
>> don't mention anything special:
>>
>> https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/connectors/table/formats/parquet/
>>
>> regards, Frank
