Thanks Bastien. I will check it out.
Antonio.
On Thu, Feb 10, 2022 at 11:59 AM bastien dine
wrote:
I haven't used S3 with Flink, but according to this doc:
https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/filesystems/s3/
you can set up S3 fairly easily and write to s3://path/to/your/file with a
write sink.
The page talks about the DataStream API, but it should work with DataSet too.
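To make that concrete, here is a minimal sketch (not from the thread) of writing a DataSet to S3 with a text sink in Flink 1.14. The bucket name and path are hypothetical; it assumes an S3 filesystem plugin (flink-s3-fs-hadoop or flink-s3-fs-presto) is on the plugin path and credentials are configured in flink-conf.yaml.

```java
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.core.fs.FileSystem;

public class S3SinkSketch {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        DataSet<String> states = env.fromElements("state-a", "state-b");
        // Each parallel subtask writes its own partition file under the path,
        // so nothing is collected into a single JVM.
        states.writeAsText("s3://my-bucket/flink/output",
                FileSystem.WriteMode.OVERWRITE);
        env.execute("write-states-to-s3");
    }
}
```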
Thanks Bastien. Can you point me to an example of using a sink, as we are
planning to write to S3?
Thanks again for your help.
Antonio.
On Thu, Feb 10, 2022 at 11:49 AM bastien dine
wrote:
Hello Antonio,
The .collect() method should be used with caution: it collects the DataSet
(multiple partitions across multiple TaskManagers) into a single List on the
JobManager, i.e. entirely in memory.
Unless you have a lot of RAM there, you cannot use it this way, and you
probably should not anyway.
I recommend using a sink to write the data out instead.
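The difference can be sketched as follows (a hypothetical example, not code from the thread; the output path is an assumption):

```java
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.core.fs.FileSystem;

public class SinkInsteadOfCollect {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        DataSet<Long> large = env.generateSequence(0, 1_000_000);

        // Risky for large data: pulls every partition into one List
        // on the client/JobManager, which can OOM:
        //   List<Long> all = large.collect();

        // Safer: each parallel subtask streams its partition to a file,
        // so memory usage stays bounded per TaskManager.
        large.writeAsText("file:///tmp/flink-out",
                FileSystem.WriteMode.OVERWRITE);
        env.execute("sink-instead-of-collect");
    }
}
```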
Hi,
I am using the State Processor API to read the states from a savepoint
file.
It works fine when the state size is small, but when the state is larger,
around 11 GB, I get an OOM. I think it happens when it does a
dataSource.collect() to obtain the states. The stackTrace is c