Thanks for the help, guys. I can work with that. Maybe it makes sense to add something like that to the parquet doc file: https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/connectors/table/formats/parquet/
This documentation does not mention Hadoop at all, and it seemed just as straightforward as the other formats.

regards, Frank

On Wed, Feb 22, 2023 at 9:42 AM Martijn Visser <martijnvis...@apache.org> wrote:

> Hi Frank,
>
> Parquet always requires Hadoop. There is a Parquet ticket to make it
> possible to read/write Parquet without depending on Hadoop, but that's
> still open. So in order for Flink to be able to work with Parquet, it
> requires the necessary Hadoop dependencies, as outlined in
> https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/configuration/advanced/#hadoop-dependencies.
> When I made a recipe for writing Parquet, I needed to add at least
> org.apache.hadoop:hadoop-common and
> org.apache.hadoop:hadoop-mapreduce-client-core.
>
> Best regards,
>
> Martijn
>
> On Thu, Feb 9, 2023 at 10:07 AM Frank Lyaruu <flya...@gmail.com> wrote:
>
>> Hi all, I'm using the Flink k8s operator to run a SQL stream to/from
>> various connectors, and I just added a Parquet format. I customized the
>> image a bit per the example (mostly by adding Maven downloads of
>> flink-connector* jars). When I do that for flink-parquet-1.16.1, it
>> fails on a missing org/apache/hadoop/conf/Configuration.
>>
>> I started adding hadoop-common (which contains that class), but that
>> artifact is huge and pulls in a bunch of dependencies of its own, so
>> this would be quite the rabbit hole. I found an old thread that seems
>> very similar, https://www.mail-archive.com/user@flink.apache.org/msg43028.html,
>> but it reaches no conclusion.
>>
>> How _is_ this supposed to work? The Flink docs on the Parquet format
>> don't mention anything special:
>>
>> https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/connectors/table/formats/parquet/
>>
>> regards, Frank
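For anyone finding this thread later: since the context here is a customized image for the Flink Kubernetes operator, Martijn's two-artifact suggestion could be sketched as a Dockerfile addition like the one below. This is only a sketch under assumptions, not a verified recipe: the Hadoop version (3.3.4) and the base image tag are guesses, and because hadoop-common has transitive dependencies of its own (Frank's point), dropping just these two jars into lib/ may still not be enough for every code path. The Flink advanced-configuration page linked above recommends providing a full Hadoop classpath via HADOOP_CLASSPATH where a Hadoop distribution is available.

```dockerfile
# Hypothetical sketch: extend the official Flink image and add the two
# Hadoop artifacts Martijn mentioned. Versions are assumptions; align
# them with your environment, and note that transitive dependencies of
# hadoop-common are NOT covered by this.
FROM flink:1.16.1

RUN wget -P /opt/flink/lib/ \
      https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-common/3.3.4/hadoop-common-3.3.4.jar && \
    wget -P /opt/flink/lib/ \
      https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-mapreduce-client-core/3.3.4/hadoop-mapreduce-client-core-3.3.4.jar
```

If you already ship a Hadoop installation in the image, exporting HADOOP_CLASSPATH (as the linked Flink docs describe) is the more robust option, since it includes Hadoop's transitive dependencies as well.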