Re: Seeking suggestions for ingesting large amount of data from S3

2023-02-12 Thread yuxia
by a lack of memory or something else. By the way, have you tried using Flink SQL to ingest the data? Does the OOM still happen? Best regards, Yuxia From: "Yang Liu" To: "User" Sent: Friday, February 10, 2023, 5:10:49 AM Subject: Seeking suggestions for ingesting large amount of data from S3 Hi

Seeking suggestions for ingesting large amount of data from S3

2023-02-09 Thread Yang Liu
Hi all, We are trying to ingest a large amount of data (20 TB) from S3 using the Flink filesystem connector to bootstrap a Hudi table. The data is well partitioned in S3 by date/time, but we have been facing OOM issues in our Flink jobs, so we wanted to update the Flink job to ingest the data chunk by chunk (p