Hi Diogo,

I have seen similar issues before. The root cause is always that the job does not use any Flink-specific functionality at all but goes to Hadoop's Parquet writer directly. As you can see in your stack trace, there is not a single reference to a Flink class.

The usual solution is to use the respective Flink sink, the StreamingFileSink, instead of bypassing it [1]. If you opt to implement it manually nonetheless, it's probably easier to bundle Hadoop through a non-Flink dependency.
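For reference, here is a minimal sketch of what the sink-based approach could look like with the StreamingFileSink's bulk Parquet format. It assumes an Avro-generated Record class, the flink-parquet dependency on the classpath, and a made-up s3a output path; adjust those to your setup:

    import org.apache.flink.core.fs.Path;
    import org.apache.flink.formats.parquet.avro.ParquetAvroWriters;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

    // stream is your existing DataStream<Record> of Avro records
    StreamingFileSink<Record> sink = StreamingFileSink
        .forBulkFormat(
            new Path("s3a://my-bucket/output"),                  // hypothetical bucket/path
            ParquetAvroWriters.forSpecificRecord(Record.class))  // Flink-managed Parquet writer
        .build();

    stream.addSink(sink);

This way all Parquet writing goes through Flink's filesystem abstraction, so the S3 and Hadoop dependencies that ship with Flink are used consistently.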
[1] https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/connectors/streamfile_sink.html

On Thu, Apr 16, 2020 at 5:36 PM Diogo Santos <diogodssan...@gmail.com> wrote:

> Hi Till,
>
> It definitely seems to be a strange issue. The first time the job is loaded
> (with a clean instance of the cluster) the job runs fine, but if it is
> canceled or started again, the issue comes back.
>
> I built an example here: https://github.com/congd123/flink-s3-example
>
> You can generate the artifact of the Flink job and start the cluster with
> the configuration in the docker-compose.
>
> Thanks for helping

--
Arvid Heise | Senior Java Developer