Hi,
I am using apache-flink 1.19 and Python 3.11. I have a very simple batch
job that registers a source table using CREATE TABLE … and writes to a
sink table registered with another CREATE TABLE …. Before writing to the
sink table, I run a dedup query:
SELECT *
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY
information about S3 dataset sink
>
> Regards,
>
>
> On Thu, Feb 10, 2022 at 20:52, Antonio Si wrote:
>
>> Thanks Bastien. Can you point to an example of using a sink as we are
>> planning to write to S3?
>>
>> Thanks again for your help.
>>
>> Antoni
if it's too big, into something splittable
>
> Regards,
> Bastien
>
> --
>
> Bastien DINE
> Data Architect / Software Engineer / Sysadmin
> bastiendine.io
>
>
> On Thu, Feb 10, 2022 at 20:32, Antonio Si wrote:
>
>> Hi,
>>
>> I
Hi,
I am using the State Processor API to read state from a savepoint
file.
It works fine when the state is small, but when the state is
larger, around 11 GB, I get an OOM. I think it happens when it is
doing a dataSource.collect() to obtain the states. The stackTrace is c